PREFACE


Several years ago while reading Weil’s Number Theory: An Approach 
Through History, I noticed a conjecture of Euler concerning primes of the
form $x^2 + 14y^2$. That same week I picked up Cohn’s A Classical Invitation
to Algebraic Numbers and Class Fields and saw the same example treated 
from the point of view of the Hilbert class field. The coincidence made it 
clear that something interesting was going on, and this book is my attempt 
to tell the story of this wonderful part of mathematics. 

I am an algebraic geometer by training, and number theory has always 
been more of an avocation than a profession for me. This will help explain 
some of the curious omissions in the book. There may also be errors of his- 
tory or attribution (for which I take full responsibility), and doubtless some 
of the proofs can be improved. Corrections and comments are welcome! 

I would like to thank my colleagues in the number theory seminars 
of Oklahoma State University and the Five Colleges (Amherst College, 
Hampshire College, Mount Holyoke College, Smith College and the Uni- 
versity of Massachusetts) for the opportunity to present material from this 
book in preliminary form. Special thanks go to Dan Flath and Peter Nor- 
man for their comments on earlier versions of the manuscript. I am also 
grateful to the reference librarians at Amherst College and Oklahoma State 
University for their help in obtaining books through interlibrary loan. 


DAVID A. Cox 


Amherst, Massachusetts 
August 1989 


NOTATION 


The following standard notation will be used throughout the book. 


$\mathbb{Z}$                   The integers.
$\mathbb{Q}$                   The rational numbers.
$\mathbb{R}$                   The real numbers.
$\mathbb{C}$                   The complex numbers.
$\mathfrak{h}$                 The upper half plane $\{x + iy \in \mathbb{C} : y > 0\}$.
$\mathbb{Z}/n\mathbb{Z}$       The ring of integers modulo n.
$[a] \in A/B$                  The coset of $a \in A$ in the quotient A/B.
$R^*$                          The group of units in a commutative ring R with identity.
$GL(2, R)$                     The group of invertible matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $a, b, c, d \in R$.
$SL(2, R)$                     The subgroup of GL(2, R) of matrices with determinant 1.
$\mathrm{Gal}(L/K)$            The Galois group of the field extension $K \subset L$.
$\mathcal{O}_K$                The ring of algebraic integers in a finite extension K of $\mathbb{Q}$.
$\zeta_n = e^{2\pi i/n}$       The standard primitive nth root of unity.
$[a, b]$                       The set $\{ma + nb : m, n \in \mathbb{Z}\}$.
$\gcd(a, b)$                   The greatest common divisor of the integers a and b.
$|S|$                          The number of elements in a finite set S.
Q.E.D.                         The end of a proof or the absence of a proof.

PRIMES OF THE FORM $x^2 + ny^2$


INTRODUCTION 


Most first courses in number theory or abstract algebra prove a theorem of 
Fermat which states that for an odd prime p, 


$$p = x^2 + y^2,\ x, y \in \mathbb{Z} \iff p \equiv 1 \bmod 4.$$


This is only the first of many related results that appear in Fermat’s works. 
For example, Fermat also states that if p is an odd prime, then 


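$$\begin{aligned}
p = x^2 + 2y^2,\ x, y \in \mathbb{Z} &\iff p \equiv 1, 3 \bmod 8 \\
p = x^2 + 3y^2,\ x, y \in \mathbb{Z} &\iff p = 3 \text{ or } p \equiv 1 \bmod 3.
\end{aligned}$$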


These facts are lovely in their own right, but they also make one curious 
to know what happens for primes of the form $x^2 + 5y^2$, $x^2 + 6y^2$, etc. This
leads to the basic question of the whole book, which we formulate as fol- 
lows: 


Basic Question 0.1. Given a positive integer n, which primes p can be expressed
in the form

$$p = x^2 + ny^2$$

where x and y are integers?

We will answer this question completely, and along the way we will encounter
some remarkably rich areas of number theory. The first steps will
be easy, involving only quadratic reciprocity and the elementary theory of
quadratic forms in two variables over $\mathbb{Z}$. These methods work nicely in the
special cases considered above by Fermat. Using genus theory and cubic 
and biquadratic reciprocity, we can treat some more cases, but elementary 
methods fail to solve the problem in general. To proceed further, we need 
class field theory. This provides an abstract solution to the problem, but 
doesn’t give explicit criteria for a particular choice of n in $x^2 + ny^2$. The
final step uses modular functions and complex multiplication to show that
for a given n, there is an algorithm for answering our question of when
$p = x^2 + ny^2$.
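
For small primes the basic question can be explored by direct search. The following Python sketch is not part of the original text; the helper names `is_prime`, `represent` and `fermat_predicts` are illustrative assumptions. It looks for x, y with $p = x^2 + ny^2$ and compares the outcome with Fermat's congruence conditions for n = 1, 2, 3.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test; adequate for the small ranges used here."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def represent(p, n):
    """Return (x, y) with x^2 + n*y^2 == p and x, y >= 0, or None if none exists."""
    y = 0
    while n * y * y <= p:
        rest = p - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            return x, y
        y += 1
    return None

def fermat_predicts(p, n):
    """Fermat's congruence conditions for an odd prime p and n = 1, 2, 3."""
    if n == 1:
        return p % 4 == 1
    if n == 2:
        return p % 8 in (1, 3)
    if n == 3:
        return p == 3 or p % 3 == 1
    raise ValueError("Fermat's conditions are stated only for n = 1, 2, 3")

def fermat_agrees(limit):
    """Compare brute force with Fermat's conditions for all odd primes below limit."""
    return all((represent(p, n) is not None) == fermat_predicts(p, n)
               for n in (1, 2, 3)
               for p in range(3, limit) if is_prime(p))
```

Calling `fermat_agrees(500)` checks the three statements for all odd primes below 500. Such a search is only an experiment: it says nothing about why the congruences appear, which is the subject of the book.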

This book has several goals. The first, to answer the basic question, has 
already been stated. A second goal is to bridge the gap between elementary 
number theory and class field theory. Although our basic question is simple 
enough to be stated in any beginning course in number theory, we will see 
that its solution is intimately bound up with higher reciprocity laws and class 
field theory. A related goal is to provide a well-motivated introduction to 
the classical formulation of class field theory. This will be done by carefully 
stating the basic theorems and illustrating their power in various concrete
situations. 

Let us summarize the contents of the book in more detail. We begin in 
Chapter One with the more elementary approaches to the problem, using 
the works of Fermat, Euler, Lagrange, Legendre and Gauss as a guide. In 
§1, we will give Euler’s proofs of the above theorems of Fermat for primes
of the form $x^2 + y^2$, $x^2 + 2y^2$ and $x^2 + 3y^2$, and we will see what led Euler
to discover quadratic reciprocity. We will also discuss the conjectures Euler
made concerning $p = x^2 + ny^2$ for n > 3. Some of these conjectures, such
as

$$p = x^2 + 5y^2 \iff p \equiv 1, 9 \bmod 20, \tag{0.2}$$

are similar to Fermat’s theorems, while others, like

$$p = x^2 + 27y^2 \iff \begin{cases} p \equiv 1 \bmod 3 \text{ and 2 is a} \\ \text{cubic residue modulo } p, \end{cases}$$

are quite unexpected. For later purposes, note that this conjecture can be
written in the following form:

$$p = x^2 + 27y^2 \iff \begin{cases} p \equiv 1 \bmod 3 \text{ and } x^3 \equiv 2 \bmod p \\ \text{has an integer solution.} \end{cases} \tag{0.3}$$
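
Euler's conjecture (0.3) is easy to test numerically. The sketch below is an illustration rather than anything from the original text, and the names `represent`, `two_is_cube_mod` and `euler_predicts_27` are assumptions made here. It compares a direct search for $p = x^2 + 27y^2$ with the condition that $p \equiv 1 \bmod 3$ and $x^3 \equiv 2 \bmod p$ is solvable.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def represent(p, n):
    """Return (x, y) with x^2 + n*y^2 == p and x, y >= 0, or None."""
    y = 0
    while n * y * y <= p:
        rest = p - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            return x, y
        y += 1
    return None

def two_is_cube_mod(p):
    """True if x^3 = 2 (mod p) has a solution, found by trying every residue."""
    return any(pow(x, 3, p) == 2 % p for x in range(p))

def euler_predicts_27(p):
    """Right-hand side of (0.3)."""
    return p % 3 == 1 and two_is_cube_mod(p)

def conjecture_holds(limit):
    """Check (0.3) for every prime below limit."""
    return all((represent(p, 27) is not None) == euler_predicts_27(p)
               for p in range(2, limit) if is_prime(p))
```

For instance, $31 = 2^2 + 27 \cdot 1^2$, and indeed $4^3 = 64 \equiv 2 \bmod 31$.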


In §2, we will study Lagrange’s theory of positive definite quadratic 
forms. After introducing the basic concepts of reduced form and class num- 
ber, we will develop an elementary form of genus theory which will enable 
us to prove (0.2) and similar theorems. Unfortunately, for cases like (0.3), 
genus theory can only prove the partial result that

$$p = \left\{ \begin{array}{c} x^2 + 27y^2 \\ \text{or} \\ 4x^2 + 2xy + 7y^2 \end{array} \right\} \iff p \equiv 1 \bmod 3. \tag{0.4}$$

The problem is that $x^2 + 27y^2$ and $4x^2 + 2xy + 7y^2$ lie in the same genus
and hence can’t be separated by simple congruences. We will also discuss 
Legendre’s tentative attempts at a theory of composition. 
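
The partial result (0.4) can also be checked by brute force. In the following sketch (an illustration only; `represented_by` and `genus_result_holds` are names assumed here) a prime is tested against both forms of the genus. The search box $|x|, |y| \le \sqrt{p} + 1$ is large enough for these two particular forms, though not for an arbitrary quadratic form.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def represented_by(p, a, b, c):
    """True if a*x^2 + b*x*y + c*y^2 == p for integers x, y in a small box.

    The box |x|, |y| <= isqrt(p) + 1 suffices for x^2 + 27y^2 and
    4x^2 + 2xy + 7y^2; it is an assumption that must be rechecked for other forms.
    """
    bound = isqrt(p) + 1
    return any(a * x * x + b * x * y + c * y * y == p
               for x in range(-bound, bound + 1)
               for y in range(-bound, bound + 1))

def genus_result_holds(limit):
    """Check (0.4): p is represented by one of the two forms iff p = 1 mod 3."""
    return all((represented_by(p, 1, 0, 27) or represented_by(p, 4, 2, 7))
               == (p % 3 == 1)
               for p in range(2, limit) if is_prime(p))
```

For example, 31 is represented by the first form and 37 by the second ($4 \cdot 9 - 6 + 7 = 37$ with x = -3, y = 1), and no congruence on p alone tells the two cases apart.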

While the ideas of genus theory and composition were already present in 
the works of Lagrange and Legendre, the real depth of these theories wasn’t 
revealed until Gauss came along. In §3 we will present some basic results 
in Gauss’ Disquisitiones Arithmeticae, and in particular we will study the 
remarkable relationship between genus theory and composition. But for our 
purposes, the real breakthrough came when Gauss used cubic reciprocity to 
prove Euler’s conjecture (0.3) concerning $p = x^2 + 27y^2$. In §4 we will give
a careful statement of cubic reciprocity, and we will explain how it can be
used to prove (0.3). Similarly, biquadratic reciprocity can be used to answer
our question for $x^2 + 64y^2$. We will see that Gauss clearly recognized the
role of higher reciprocity laws in separating forms of the same genus. This
section will also begin our study of algebraic integers, for in order to state
cubic and biquadratic reciprocity, we must first understand the arithmetic
of the rings $\mathbb{Z}[e^{2\pi i/3}]$ and $\mathbb{Z}[i]$.

To go further requires class field theory, which is the topic of Chapter 
Two. We will begin in §5 with the Hilbert class field, which is the maximal 
unramified Abelian extension of a given number field. This will enable us 
to prove the following general result: 


Theorem 0.5. Let $n \equiv 1, 2 \bmod 4$ be a positive squarefree integer. Then there
is an irreducible polynomial $f_n(x) \in \mathbb{Z}[x]$ such that for a prime p dividing
neither n nor the discriminant of $f_n(x)$,

$$p = x^2 + ny^2 \iff \begin{cases} (-n/p) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \\ \text{has an integer solution.} \end{cases}$$

While the statement of Theorem 0.5 is elementary, the polynomial $f_n(x)$ is
quite sophisticated: it is the minimal polynomial of a primitive element of
the Hilbert class field L of $K = \mathbb{Q}(\sqrt{-n})$.

As an example of this theorem, we will study the case n = 14. We will
show that the Hilbert class field of $K = \mathbb{Q}(\sqrt{-14})$ is $L = K(\alpha)$, where
$\alpha = \sqrt{2\sqrt{2} - 1}$. By Theorem 0.5, this will show that for an odd prime p,

$$p = x^2 + 14y^2 \iff \begin{cases} (-14/p) = 1 \text{ and } (x^2 + 1)^2 \equiv 8 \bmod p \\ \text{has an integer solution,} \end{cases} \tag{0.6}$$

which answers our basic question for $x^2 + 14y^2$. The Hilbert class field will
also enable us in §6 to give new proofs of the main theorems of genus
theory.
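
The criterion (0.6) lends itself to a quick experiment. The sketch below is not from the original text; the function names are illustrative, and the Legendre symbol is computed with Euler's criterion, a standard technique assumed here. It compares a direct search for $p = x^2 + 14y^2$ with the two conditions of (0.6).

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def represent(p, n):
    """Return (x, y) with x^2 + n*y^2 == p and x, y >= 0, or None."""
    y = 0
    while n * y * y <= p:
        rest = p - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            return x, y
        y += 1
    return None

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def criterion_14(p):
    """Right-hand side of (0.6) for an odd prime p."""
    has_root = any(pow(x * x + 1, 2, p) == 8 % p for x in range(p))
    return legendre(-14, p) == 1 and has_root

def check_0_6(limit):
    """Check (0.6) for every odd prime below limit."""
    return all((represent(p, 14) is not None) == criterion_14(p)
               for p in range(3, limit) if is_prime(p))
```

The prime 23 passes both tests ($23 = 3^2 + 14 \cdot 1^2$, and x = 3 gives $(3^2 + 1)^2 = 100 \equiv 8 \bmod 23$), while 3 satisfies $(-14/3) = 1$ but fails the polynomial condition and is not of the form $x^2 + 14y^2$.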

The theory sketched so far is very nice, but there are some gaps in it. 
The most obvious is that the above results for $x^2 + 27y^2$ and $x^2 + 14y^2$
((0.3) and (0.6) respectively) both follow the same format, but (0.3) does
not follow from Theorem 0.5, for n = 27 is not squarefree. There should be
a unified theorem that works for all positive n, yet the proof of Theorem
0.5 breaks down for general n because $\mathbb{Z}[\sqrt{-n}]$ is not in general the full
ring of integers in $\mathbb{Q}(\sqrt{-n})$.

The goal of §§7-9 is to show that Theorem 0.5 holds for all positive 
integers n. This, in fact, is the main theorem of the whole book. In §7 
we will study the rings $\mathbb{Z}[\sqrt{-n}]$ for general n, which leads to the concept
of an order in an imaginary quadratic field. In §8 we will summarize the
main theorems of class field theory and the Čebotarev Density Theorem,
and in §9 we will introduce a generalization of the Hilbert class field called
the ring class field, which is a certain (possibly ramified) Abelian extension
of $\mathbb{Q}(\sqrt{-n})$ determined by the order $\mathbb{Z}[\sqrt{-n}]$. Then, in Theorem 9.2, we
will use the Artin Reciprocity Theorem to show that Theorem 0.5 holds for
all n > 0, where the polynomial $f_n(x)$ is now the minimal polynomial of a
primitive element of the above ring class field. To give a concrete example
of what this means, we will apply Theorem 9.2 to the case $x^2 + 27y^2$, which
will give us a class field theory proof of (0.3). In §§8 and 9 we will also 
discuss how class field theory is related to higher reciprocity theorems. 

The major drawback to the theory presented in §9 is that it is not constructive:
for a given n > 0, we have no idea how to find the polynomial
$f_n(x)$. From (0.3) and (0.6), we know $f_{27}(x)$ and $f_{14}(x)$, but the methods
used in these examples hardly generalize. Chapter Three will use the theory
of complex multiplication to remedy this situation. In §10 we will study
elliptic functions and introduce the idea of complex multiplication, and then
in §11 we will discuss modular functions and show that the j-function can
be used to generate ring class fields. As an example of the wonderful formulas
that can be proved, in §12 we will give Weber’s computation that

$$j(\sqrt{-14}) = 2^3 \left( 323 + 228\sqrt{2} + (231 + 161\sqrt{2}) \sqrt{2\sqrt{2} - 1} \right)^3.$$

These methods will also enable us to prove the Baker-Heegner-Stark Theorem
on imaginary quadratic fields of class number 1. The final section of
the book will discuss the class equation, which is the minimal polynomial
of $j(\sqrt{-n})$. We will learn how to compute the class equation, and this in
turn will lead to a constructive solution of $p = x^2 + ny^2$. We will then describe
some more recent work by Deuring and by Gross and Zagier. In 1946
Deuring proved a result about the difference of singular j-invariants, which
implies an especially elegant version of our main theorem, and drawing on
Deuring’s work, Gross and Zagier discovered yet more remarkable proper- 
ties of the class equation. The book will end with a discussion of elliptic 
curves and an application of the class equation to primality testing. 

Number theory is usually taught at three levels, as an undergraduate 
course, a beginning graduate course, or a more advanced graduate course. 
These levels correspond roughly to the three chapters of the book. Chapter 
One requires only beginning number theory (up to quadratic reciprocity) 
and a semester of abstract algebra. Since the proofs of quadratic, cubic 
and biquadratic reciprocity are omitted, this book would be best suited as 
a supplementary text in a beginning course. For Chapter Two, the reader 
should know Galois theory and some basic facts about algebraic number 
theory (these are reviewed in §5), but no previous exposure to class field 
theory is assumed. The theorems of class field theory are stated without 
proof, so that this book would be most useful as a supplement to the topics 
covered in a first graduate course. Chapter Three requires a knowledge 
of complex analysis, but otherwise it is self-contained. (Brief but complete 
accounts of the Weierstrass $\wp$-function and modular functions are included
in §§10 and 11.) This portion of the book should be suitable for use in a
graduate seminar. 

There are exercises at the end of each section, many of which consist of 
working out the details of arguments sketched in the text. Readers learning 
this material for the first time should find the exercises to be useful, while 
more sophisticated readers may skip them without loss of continuity. 

Many important (and relevant) topics are not covered in the book. An 
obvious omission in Chapter One concerns forms such as $x^2 - 2y^2$, which
were certainly considered by Fermat and Euler. Questions of this sort lead 
to Pell’s equation and the class field theory of real quadratic fields. We 
have also ignored the problem of representing arbitrary integers, not just 
primes, by quadratic forms, and there are interesting questions to ask about 
the number of such representations (this material is covered in Grosswald’s 
recent book [47]). In Chapter Two we do not discuss adeles or ideles—we 
give only a classical formulation of class field theory. For a more modern 
treatment, see either Neukirch [80] or Weil [104]. We also do not do justice 
to the use of analytic methods in number theory. For a nice introduction 
in the case of quadratic fields, see Zagier [111]. Our treatment of elliptic 
curves is rather incomplete. See Husemöller [58] or Silverman [93] for the
basic theory, while more advanced topics are covered by Lang [73] and 
Shimura [90]. 

There are many books which touch on the number theory encountered 
in studying the problem of representing primes by $x^2 + ny^2$. Four books
that we particularly recommend are Cohn’s A Classical Invitation to Algebraic
Numbers and Class Fields [19], Lang’s Elliptic Functions [73], Scharlau
and Opolka’s From Fermat to Minkowski [86], and Weil’s Number Theory:
An Approach Through History [106]. These books, as well as others to be 
found in the bibliography, open up an extraordinarily rich area of mathe- 
matics. The purpose of this book is to reveal some of this richness and to 
encourage the reader to learn more about it. 


CHAPTER ONE 


FROM FERMAT TO GAUSS 


§1. FERMAT, EULER AND QUADRATIC RECIPROCITY 


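In this section we will discuss primes of the form $x^2 + ny^2$, where n is
a fixed positive integer. Our starting point will be the three theorems of
Fermat

$$\begin{aligned}
p = x^2 + y^2,\ x, y \in \mathbb{Z} &\iff p \equiv 1 \bmod 4 \\
p = x^2 + 2y^2,\ x, y \in \mathbb{Z} &\iff p \equiv 1 \text{ or } 3 \bmod 8 \\
p = x^2 + 3y^2,\ x, y \in \mathbb{Z} &\iff p = 3 \text{ or } p \equiv 1 \bmod 3
\end{aligned} \tag{1.1}$$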


mentioned in the introduction. The goals of §1 are to prove (1.1) and,
more importantly, to get a sense of what’s involved in studying the equation
$p = x^2 + ny^2$ when n > 0 is arbitrary. This last question was best answered
by Euler, who spent 40 years proving Fermat’s theorems and thinking about
how they can be generalized. Our exposition will follow some of Euler’s
papers closely, both in the theorems proved and in the examples studied.
We will see that Euler’s strategy for proving (1.1) was one of the primary
things that led him to discover quadratic reciprocity, and we will also discuss
some of his conjectures concerning $p = x^2 + ny^2$ for n > 3. These remarkable
conjectures touch on quadratic forms, composition, genus theory,
cubic and biquadratic reciprocity, and will keep us busy for the rest of the
chapter.

A. Fermat


Fermat’s first mention of $p = x^2 + y^2$ occurs in a 1640 letter to Mersenne
[35, Vol. II, p. 212], while $p = x^2 + 2y^2$ and $p = x^2 + 3y^2$ come later, first
appearing in a 1654 letter to Pascal [35, Vol. II, pp. 310-314]. Although 
no proofs are given in these letters, Fermat states the results as theorems. 
Writing to Digby in 1658, he repeats these assertions in the following form: 


Every prime number which surpasses by one a multiple of four is 
composed of two squares. Examples are 5, 13, 17, 29, 37, 41, etc. 


Every prime number which surpasses by one a multiple of three is 
composed of a square and the triple of another square. Examples are 
7, 13, 19, 31, 37, 43, etc. 

Every prime number which surpasses by one or three a multiple 
of eight is composed of a square and the double of another square. 
Examples are 3, 11, 17, 19, 41, 43, etc. 


Fermat adds that he has solid proofs—“firmissimis demonstratibus” [35, 
Vol. II, pp. 402-408 (Latin), Vol. III, pp. 314-319 (French)].

The theorems (1.1) are only part of the work that Fermat did with $x^2 + ny^2$.
For example, concerning $x^2 + y^2$, Fermat knew that a positive integer
N is the sum of two squares if and only if the quotient of N by its largest
square factor is a product of primes congruent to 1 modulo 4 [35, Vol. III,
Obs. 26, pp. 256-257], and he knew the number of different ways N can
be so represented [35, Vol. III, Obs. 7, pp. 243-246]. Fermat also studied
forms beyond $x^2 + y^2$, $x^2 + 2y^2$ and $x^2 + 3y^2$. For example, in the 1658
letter to Digby quoted above, Fermat makes the following conjecture about
$x^2 + 5y^2$, which he admits he can’t prove:


If two primes, which end in 3 or 7 and surpass by three a multi- 
ple of four, are multiplied, then their product will be composed of a 
square and the quintuple of another square. 


Examples are the numbers 3, 7, 23, 43, 47, 67, etc. Take two of 
them, for example 7 and 23; their product 161 is composed of a square 
and the quintuple of another square. Namely 81, a square, and the 
quintuple of 16 equal 161. 


Fermat’s condition on the primes is simply that they be congruent to 3 or 7 
modulo 20. In §2 we will present Lagrange’s proof of this conjecture, which 
uses ideas from genus theory and the composition of forms. 
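
Fermat's conjecture about $x^2 + 5y^2$ can be tested on small cases before any theory is developed. The following sketch is an illustration with assumed helper names, not material from the original text. It multiplies pairs of distinct primes congruent to 3 or 7 modulo 20 and searches for a representation of the product.

```python
from itertools import combinations
from math import isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def represent(m, n):
    """Return (x, y) with x^2 + n*y^2 == m and x, y >= 0, or None."""
    y = 0
    while n * y * y <= m:
        rest = m - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            return x, y
        y += 1
    return None

def fermat_primes_mod_20(limit):
    """Primes below limit that are congruent to 3 or 7 modulo 20."""
    return [p for p in range(3, limit) if is_prime(p) and p % 20 in (3, 7)]

def conjecture_holds(limit):
    """Every product of two distinct such primes should be x^2 + 5y^2."""
    return all(represent(p * q, 5) is not None
               for p, q in combinations(fermat_primes_mod_20(limit), 2))
```

Here `represent(161, 5)` recovers Fermat's own example $161 = 9^2 + 5 \cdot 4^2$. Note that the individual primes 7 and 23 are not of this form; only their product is.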

Fermat’s proofs used the method of infinite descent, but that’s often all 
he said. As an example, here is Fermat’s description of his proof for
$p = x^2 + y^2$ [35, Vol. II, p. 432]:

If an arbitrarily chosen prime number, which surpasses by one a
multiple of four, is not a sum of two squares, then there is a prime 
number of the same form, less than the given one, and then yet a third 
still less, etc., descending infinitely until you arrive at the number 5, 
which is the least of all of this nature, from which it would follow 
was not the sum of two squares. From this one must infer, by deduc- 
tion of the impossible, that all numbers of this form are consequently 
composed of two squares. 


This explains the philosophy of infinite descent, but doesn’t tell us how 
to produce the required lesser prime. In fact, we have only one complete 
proof by Fermat. It occurs in one of his marginal notes (the area of a right 
triangle with integral sides cannot be an integral square [35, Vol. III, Obs. 
45, pp. 271-272]—for once the margin was big enough!). The methods of 
this proof (see Weil [106, p. 77] or Edwards [31, pp. 10-14] for modern 
expositions) do not apply to our case, so that we are still in the dark. In 
his recent book [106], Weil makes a careful study of Fermat’s letters and 
marginal notes, and with some hints from Euler, he reconstructs some of 
Fermat’s proofs. Weil’s arguments are quite convincing, but we won’t go 
into them here. For the present, we prefer to leave things as Euler found 
them, i.e., wonderful theorems but no proofs. 


B. Euler 


Euler first heard of Fermat’s results through his correspondence with Gold- 
bach. In fact, Goldbach’s first letter to Euler, written in December 1729, 
mentions Fermat’s conjecture that $2^{2^n} + 1$ is always prime [40, p. 10]. Shortly
thereafter, Euler read some of Fermat’s letters that had been printed in 
Wallis’ Opera [100] (which included the one to Digby quoted above). Euler 
was intrigued by what he found. For example, writing to Goldbach in June 
1730, Euler comments that Fermat’s four-square theorem (every positive 
integer is a sum of four or fewer squares) is a “non inelegans theorema” 
[40, p. 24]. For Euler, Fermat’s assertions were serious theorems deserving 
of proof, and finding the proofs became a life-long project. Euler’s first paper
on number theory, written in 1732 at age 25, disproves Fermat’s claim
about $2^{2^n} + 1$ by showing that 641 is a factor of $2^{32} + 1$ [33, Vol. II, pp. 1-5].
Euler’s interest in number theory continued unabated for the next 51
years—there was a steady stream of papers introducing many of the fun- 
damental concepts of number theory, and even after his death in 1783, his 
papers continued to appear until 1830 (see [33, Vol. IV-V]). Weil’s book 
[106] gives a detailed survey of Euler’s work on number theory (other ref- 
erences are Burkhardt [14], Edwards [31, Chapter 2], Scharlau and Opolka 
[86, Chapter 3], and the introductions to Volumes II-V of Euler’s collected 
works [33]). 
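
Euler's counterexample to Fermat's claim about $2^{2^n} + 1$ is a one-line computation today. The snippet below is only an arithmetic check added for illustration; the variable and function names are assumptions.

```python
def fermat_number(n):
    """The n-th Fermat number 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

# Euler's factor: 641 divides the fifth Fermat number 2^32 + 1.
quotient, remainder = divmod(fermat_number(5), 641)
```

The remainder is 0 and the quotient is 6700417, so $2^{32} + 1 = 641 \cdot 6700417$, whereas the Fermat numbers for n = 0, ..., 4 are the primes 3, 5, 17, 257 and 65537.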

We can now present Euler’s proof of the first of Fermat’s theorems from (1.1):


Theorem 1.2. An odd prime p can be written as $x^2 + y^2$ if and only if
$p \equiv 1 \bmod 4$.


Proof. If $p = x^2 + y^2$, then congruences modulo 4 easily imply that
$p \equiv 1 \bmod 4$. The hard work is proving the converse. We will give a modern
version of Euler’s proof. Given an odd prime p, there are two basic steps
to be proved:


Descent Step: If $p \mid x^2 + y^2$, gcd(x, y) = 1, then p can be written as $x^2 + y^2$.
Reciprocity Step: If $p \equiv 1 \bmod 4$, then $p \mid x^2 + y^2$, gcd(x, y) = 1.


It will soon become clear why we use the names “Descent” and “Reci- 
procity.” 

We'll do the Descent Step first since that’s what happened historically. 
The argument below is taken from a 1747 letter to Goldbach [40, pp. 416- 
419] (see also [33, Vol. II, pp. 295-327]). We begin with the classical identity 


$$(x^2 + y^2)(z^2 + w^2) = (xz \pm yw)^2 + (xw \mp yz)^2 \tag{1.3}$$

(see Exercise 1.1) which enables one to express composite numbers as sums
of squares. The key observation is the following lemma:


Lemma 1.4. Suppose that N is a sum of two relatively prime squares, and
that $q = x^2 + y^2$ is a prime divisor of N. Then N/q is also a sum of two
relatively prime squares.


Proof. Write $N = a^2 + b^2$, where a and b are relatively prime. We also have
$q = x^2 + y^2$, and thus q divides

$$\begin{aligned}
x^2 N - a^2 q &= x^2(a^2 + b^2) - a^2(x^2 + y^2) \\
&= x^2 b^2 - a^2 y^2 = (xb - ay)(xb + ay).
\end{aligned}$$

Since q is prime, it divides one of these two factors, and changing the sign
of a if necessary, we can assume that $q \mid xb - ay$. Thus $xb - ay = dq$ for
some integer d.

We claim that $x \mid a + dy$. Since x and y are relatively prime, this is
equivalent to $x \mid (a + dy)y$. However,

$$\begin{aligned}
(a + dy)y = ay + dy^2 &= xb - dq + dy^2 \\
&= xb - d(x^2 + y^2) + dy^2 = xb - dx^2,
\end{aligned}$$

which is obviously divisible by x. Furthermore, if we set $a + dy = cx$, then
the above equation implies that $b = dx + cy$. Thus we have

$$\begin{aligned}
a &= cx - dy \\
b &= dx + cy.
\end{aligned} \tag{1.5}$$

Then, using (1.3), we obtain

$$\begin{aligned}
N = a^2 + b^2 &= (cx - dy)^2 + (dx + cy)^2 \\
&= (x^2 + y^2)(c^2 + d^2) = q(c^2 + d^2).
\end{aligned}$$

Thus $N/q = c^2 + d^2$ is a sum of squares, and (1.5) shows that c and d must
be relatively prime since a and b are. This proves the lemma. Q.E.D.
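
The proof of Lemma 1.4 is constructive, and its steps translate directly into a short computation. The following Python sketch is an illustration under stated assumptions: the function name `divide_out` is invented here, and the caller is trusted to supply relatively prime a, b and a prime $q = x^2 + y^2$ dividing $a^2 + b^2$ (so that x and y are nonzero).

```python
from math import gcd

def divide_out(a, b, x, y):
    """Given gcd(a, b) = 1 and a prime q = x^2 + y^2 dividing a^2 + b^2,
    return (c, d) with c^2 + d^2 = (a^2 + b^2) / q, following Lemma 1.4."""
    q = x * x + y * y
    if (a * a + b * b) % q != 0:
        raise ValueError("q must divide a^2 + b^2")
    # q divides (xb - ay)(xb + ay); change the sign of a if necessary.
    if (x * b - a * y) % q != 0:
        a = -a
    d = (x * b - a * y) // q      # xb - ay = dq
    c = (a + d * y) // x          # x divides a + dy, so this division is exact
    return c, d

# N = 65 = 1^2 + 8^2 and q = 5 = 1^2 + 2^2.
c, d = divide_out(1, 8, 1, 2)
```

In this example the quotient 65/5 = 13 is recovered as $c^2 + d^2$ with c and d relatively prime, as the lemma asserts.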


To complete the proof of the Descent Step, let p be an odd prime dividing
$N = a^2 + b^2$, where a and b are relatively prime. If a and b are
changed by multiples of p, we still have $p \mid a^2 + b^2$. We may thus assume
that |a| < p/2 and |b| < p/2, which in turn implies that $N < p^2/2$. The new
a and b may have a greatest common divisor d > 1, but p doesn’t divide d,
so that dividing a and b by d, we may assume that $p \mid N$, $N < p^2/2$, and
$N = a^2 + b^2$ where gcd(a, b) = 1. Then all prime divisors $q \ne p$ of N are
less than p. If q were a sum of two squares, then Lemma 1.4 would show 
that N/q would be a multiple of p, which is also a sum of two squares. 
If all such q’s were sums of two squares, then repeatedly applying Lemma 
1.4 would imply that p itself was of the same form. So if p is not a sum of 
two squares, there must be a smaller prime q with the same property. Since 
there is nothing to prevent us from repeating this process indefinitely, we 
get an infinite decreasing sequence of prime numbers. This contradiction 
finishes the Descent Step. 

This is a classical descent argument, and as Weil argues [106, pp. 68-69], 
it is probably similar to what Fermat did. In §2 we will take another ap- 
proach to the Descent Step, using the reduction theory of positive definite 
quadratic forms. 

The Reciprocity Step caused Euler a lot more trouble, taking him until
1749. Euler was clearly relieved when he could write to Goldbach “Now
have I finally found a valid proof” [40, pp. 493-495]. The basic idea is quite
simple: since $p \equiv 1 \bmod 4$, we can write p = 4k + 1. Then Fermat’s Little
Theorem implies that

$$(x^{2k} - 1)(x^{2k} + 1) = x^{4k} - 1 \equiv 0 \bmod p$$

for all $x \not\equiv 0 \bmod p$. If $x^{2k} - 1 \not\equiv 0 \bmod p$ for one such x, then $p \mid x^{2k} + 1$,
so that p divides a sum of relatively prime squares, as desired. For us, the
required x is easy to find, since $x^{2k} - 1$ is a polynomial over the field $\mathbb{Z}/p\mathbb{Z}$
and hence has at most 2k < p - 1 roots. Euler’s first proof is quite different,
for it uses the calculus of finite differences; see Exercise 1.2 for details.
This proves Fermat’s claim (1.1) for primes of the form $x^2 + y^2$. Q.E.D.
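
Taken together, the two steps give a procedure for actually writing a prime $p \equiv 1 \bmod 4$ as a sum of two squares. The sketch below is one way to organize it, not Euler's own formulation; the names `divide_out` and `two_squares` are assumptions. The Reciprocity Step supplies a with $p \mid a^2 + 1$, and the smaller prime factors of $a^2 + 1$ are then removed one at a time with Lemma 1.4, each of them being written as a sum of two squares recursively.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def divide_out(a, b, x, y):
    """Lemma 1.4: for gcd(a, b) = 1 and a prime q = x^2 + y^2 dividing a^2 + b^2,
    return (c, d) with c^2 + d^2 = (a^2 + b^2) / q."""
    q = x * x + y * y
    if (x * b - a * y) % q != 0:
        a = -a
    d = (x * b - a * y) // q
    c = (a + d * y) // x
    return c, d

def two_squares(p):
    """Return (x, y) with x^2 + y^2 = p, for p = 2 or a prime p = 1 mod 4."""
    if p == 2:
        return 1, 1
    k = (p - 1) // 4
    # Reciprocity Step: some t has t^(2k) = -1 mod p, so a = t^k has a^2 = -1.
    a = next(pow(t, k, p) for t in range(2, p) if pow(t, 2 * k, p) == p - 1)
    if a > p // 2:
        a -= p                    # now |a| < p/2, hence a^2 + 1 < p^2/2
    b = 1
    # Descent Step: every prime q dividing (a^2 + b^2)/p is smaller than p.
    m = (a * a + b * b) // p
    q = 2
    while m > 1:
        if m % q == 0:            # q is the least divisor of m, hence prime
            x, y = two_squares(q)
            a, b = divide_out(a, b, x, y)
            m //= q
        else:
            q += 1
    return abs(a), abs(b)
```

For p = 29 the Reciprocity Step gives a = 12, so that $12^2 + 1 = 145 = 5 \cdot 29$, and dividing out $5 = 2^2 + 1^2$ leaves $29 = 5^2 + 2^2$. The recursion terminates because each prime factor handled is smaller than p, which is the descent in computational form.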


Euler used the same two-step strategy in his proofs for $x^2 + 2y^2$ and
$x^2 + 3y^2$. The Descent Steps are

If $p \mid x^2 + 2y^2$, gcd(x, y) = 1, then p is of the form $x^2 + 2y^2$
If $p \mid x^2 + 3y^2$, gcd(x, y) = 1, then p is of the form $x^2 + 3y^2$,

and the Reciprocity Steps are

If $p \equiv 1, 3 \bmod 8$, then $p \mid x^2 + 2y^2$, gcd(x, y) = 1
If $p \equiv 1 \bmod 3$, then $p \mid x^2 + 3y^2$, gcd(x, y) = 1,


where p is always an odd prime. In each case, the Reciprocity Step was 
harder to prove than the Descent Step, and Euler didn’t succeed in giving 
complete proofs of Fermat’s theorems (1.1) until 1772, 40 years after he 
first read about them. Weil discusses the proofs for $x^2 + 2y^2$ and $x^2 + 3y^2$
in [106, pp. 178-179, 191, and 210-212], and in Exercises 1.4 and 1.5 we will
present a version of Euler’s argument for $x^2 + 3y^2$.


C. $p = x^2 + ny^2$ and Quadratic Reciprocity


Let’s turn to the general case of $p = x^2 + ny^2$, where n is now any positive
integer. To study this problem, it makes sense to start with Euler’s two-step 
strategy. This won’t lead to a proof, but the Descent and Reciprocity Steps 
will both suggest some very interesting questions for us to pursue. 

The Descent Step for arbitrary n > 0 begins with the identity 


$$(x^2 + ny^2)(z^2 + nw^2) = (xz \pm nyw)^2 + n(xw \mp yz)^2 \tag{1.6}$$

(see Exercise 1.1), and Lemma 1.4 generalizes easily for n > 0 (see Exercise
1.3). Then suppose that $p \mid x^2 + ny^2$. As in the proof of the Descent Step
in Theorem 1.2, we can assume that |x|, |y| < p/2. For $n \le 3$, it follows
that $x^2 + ny^2 < p^2$ when p is odd, and then the argument from Theorem
1.2 shows that p is of the form $x^2 + ny^2$ (see Exercise 1.4). One might
conjecture that this holds in general, i.e., that $p \mid x^2 + ny^2$ always implies
$p = x^2 + ny^2$. Unfortunately this fails even for n = 5: for example,
$3 \mid 21 = 1^2 + 5 \cdot 2^2$ but $3 \ne x^2 + 5y^2$. Euler knew this, and most likely so did Fermat
(remember his speculations about $x^2 + 5y^2$). So the question becomes: how
are prime divisors of $x^2 + ny^2$ to be represented? As we will see in §2, the
proper language for this is Lagrange’s theory of quadratic forms, and in
particular a complete solution to the Descent Step will follow from the
properties of reduced forms.
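
The contrast between small n and n = 5 can be seen in a small experiment. The following sketch uses assumed helper names and is meant only as an illustration. It checks, for small odd primes, whether dividing some $x^2 + ny^2$ with relatively prime x and y forces p itself to be of that form.

```python
from math import gcd, isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def represent(p, n):
    """Return (x, y) with x^2 + n*y^2 == p and x, y >= 0, or None."""
    y = 0
    while n * y * y <= p:
        rest = p - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            return x, y
        y += 1
    return None

def divides_some_value(p, n):
    """True if p | x^2 + n*y^2 for some relatively prime x, y.
    It is enough to search residues 0 <= x, y < p."""
    return any((x * x + n * y * y) % p == 0 and gcd(x, y) == 1
               for x in range(p) for y in range(p))

def descent_step_holds(n, limit):
    """Does p | x^2 + n*y^2 (x, y coprime) imply p = x^2 + n*y^2 for odd primes p < limit?"""
    return all(represent(p, n) is not None
               for p in range(3, limit)
               if is_prime(p) and divides_some_value(p, n))
```

With these definitions `descent_step_holds(n, 60)` succeeds for n = 1, 2, 3 but fails for n = 5, the prime 3 being the first obstruction.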

Turning to the Reciprocity Step for n > 0, the general case asks for congruence
conditions on a prime p which will guarantee $p \mid x^2 + ny^2$. To see
what kind of congruences we need, note that the conditions of (1.1) can
be unified by working modulo 4n. Thus, given n > 0, we’re looking for
a congruence of the form $p \equiv \alpha, \beta, \ldots \bmod 4n$ which implies $p \mid x^2 + ny^2$,
gcd(x, y) = 1. To give a modern formulation of this last condition, we first
define the Legendre symbol (a/p). If a is an integer and p an odd prime,
then

$$\left(\frac{a}{p}\right) = \begin{cases} 0 & p \mid a \\ 1 & p \nmid a \text{ and } a \text{ is a quadratic residue modulo } p \\ -1 & p \nmid a \text{ and } a \text{ is a quadratic nonresidue modulo } p. \end{cases}$$

We can now restate $p \mid x^2 + ny^2$ as follows:


Lemma 1.7. Let n be a nonzero integer, and let p be an odd prime not
dividing n. Then

$$p \mid x^2 + ny^2,\ \gcd(x, y) = 1 \iff \left(\frac{-n}{p}\right) = 1.$$


Proof. The basic idea is that if $x^2 + ny^2 \equiv 0 \bmod p$ and gcd(x, y) = 1, then
y must be relatively prime to p and consequently has a multiplicative inverse
modulo p. The details are left to the reader (see Exercise 1.6).
Q.E.D.
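
Lemma 1.7 is simple to test by machine. In the sketch below, which is an illustration with assumed function names, the Legendre symbol is evaluated by Euler's criterion, a standard fact not proved in this section, and the left-hand side of the lemma is decided by searching residues modulo p.

```python
from math import gcd, isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def divides_some_value(p, n):
    """True if p | x^2 + n*y^2 for some relatively prime x, y (residue search)."""
    return any((x * x + n * y * y) % p == 0 and gcd(x, y) == 1
               for x in range(p) for y in range(p))

def lemma_1_7_holds(p, n):
    """Compare both sides of Lemma 1.7 for an odd prime p not dividing n."""
    return divides_some_value(p, n) == (legendre(-n, p) == 1)
```

The check works for negative n as well, matching the remark that the question also makes sense when n < 0.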


The arguments of the above lemma are quite elementary, but for Euler 
they were not so easy—he first had to realize that quadratic residues were 
at the heart of the matter. This took several years, and it’s fun to watch 
his terminology evolve: in 1744, he writes “prime divisors of numbers of 
the form aa — Nbb” [33, Vol. II, p. 216]; by 1747 this changes to “residues 
arising from the division of squares by the prime p” [33, Vol. II, p. 313]; 
and by 1751 the transition is complete—Euler now uses the terms “residua” 
and “non-residua” freely, with the “quadratic” being understood [33, Vol. II, 
p. 343]. 

Using Lemma 1.7, the Reciprocity Step can be restated as the following 
question: is there a congruence $p \equiv \alpha, \beta, \ldots \bmod 4n$ which implies (-n/p)
= 1 when p is prime? This question also makes sense when n < 0, and in 
the following discussion n will thus be allowed to be positive or negative. 
We will see in Corollary 1.19 that the full answer is intimately related to 
the law of quadratic reciprocity, and in fact the Reciprocity Step was one of 
the primary things that led Euler to discover quadratic reciprocity. 

Euler became intensely interested in this question in the early 1740s, and 
he mentions numerous examples in his letters to Goldbach. In 1744 Euler 
collected together his examples and conjectures in the paper Theoremata
circa divisores numerorum in hac forma paa+qbb contentorum [33, Vol.
II, pp. 194-222]. He labels his examples as “theorems,” but they are really
“theorems found by induction,” which is eighteenth-century parlance for
conjectures based on working out some particular cases. Here are some
of Euler’s conjectures, stated in modern notation:


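$$\begin{aligned}
\left(\frac{-3}{p}\right) = 1 &\iff p \equiv 1, 7 \bmod 12 \\
\left(\frac{-5}{p}\right) = 1 &\iff p \equiv 1, 3, 7, 9 \bmod 20 \\
\left(\frac{-7}{p}\right) = 1 &\iff p \equiv 1, 9, 11, 15, 23, 25 \bmod 28 \\
\left(\frac{3}{p}\right) = 1 &\iff p \equiv \pm 1 \bmod 12 \\
\left(\frac{5}{p}\right) = 1 &\iff p \equiv \pm 1, \pm 11 \bmod 20 \\
\left(\frac{7}{p}\right) = 1 &\iff p \equiv \pm 1, \pm 3, \pm 9 \bmod 28,
\end{aligned} \tag{1.8}$$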


where p is an odd prime not dividing n. In looking for a unifying pattern, 
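the bottom three look more promising because of the ±’s. If we rewrite the
bottom half of (1.8) using $11 \equiv -9 \bmod 20$ and $3 \equiv -25 \bmod 28$, we obtain

$$\begin{aligned}
\left(\frac{3}{p}\right) = 1 &\iff p \equiv \pm 1 \bmod 12 \\
\left(\frac{5}{p}\right) = 1 &\iff p \equiv \pm 1, \pm 9 \bmod 20 \\
\left(\frac{7}{p}\right) = 1 &\iff p \equiv \pm 1, \pm 25, \pm 9 \bmod 28.
\end{aligned}$$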


All of the numbers that appear are odd squares! 
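
Euler found these patterns by working out cases, and the same experiment is easy to repeat. The sketch below is illustrative: the function names and the search limit are assumptions. It collects the residue classes modulo 4N of odd primes p with (N/p) = 1 and compares them with the classes of the form ± an odd square.

```python
from math import gcd, isqrt

def is_prime(m):
    """Trial-division primality test; adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def residue_classes(N, limit=3000):
    """Classes p mod 4|N| of odd primes p < limit, p not dividing N, with (N/p) = 1."""
    m = 4 * abs(N)
    return sorted({p % m for p in range(3, limit)
                   if is_prime(p) and N % p != 0 and legendre(N, p) == 1})

def plus_minus_odd_squares(m):
    """Classes of b^2 and -b^2 modulo m, for odd b relatively prime to m."""
    return sorted({s for b in range(1, m, 2) if gcd(b, m) == 1
                   for s in (b * b % m, -b * b % m)})
```

For N = 3, 5, 7 the two lists coincide, for example `residue_classes(5)` and `plus_minus_odd_squares(20)` both give the classes 1, 9, 11, 19 modulo 20. Since only primes below the limit are examined, this is evidence in the eighteenth-century sense, not a proof.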
Before getting carried away, we should note another of Euler’s conjec- 
tures: 


$$\left(\frac{6}{p}\right) = 1 \iff p \equiv \pm 1, \pm 5 \bmod 24.$$

Unfortunately, ±5 is not a square modulo 24, and the same thing happens
for (10/p) and (14/p). But 3, 5 and 7 are prime, while 6, 10 and 14 are
composite. Thus it makes sense to make the following conjecture for the
prime case:

Conjecture 1.9. If p and q are distinct odd primes, then

$$\left(\frac{q}{p}\right) = 1 \iff p \equiv \pm\beta^2 \bmod 4q \text{ for some odd integer } \beta.$$


The remarkable fact is that this conjecture is equivalent to the usual state- 
ment of quadratic reciprocity: 


Proposition 1.10. If p and q are distinct odd primes, then Conjecture 1.9 is 
equivalent to 

    \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{(p-1)(q-1)/4}.

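Proof. Let p* = (−1)^{(p−1)/2} p. Then the standard properties 

(1.11)
    \left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}
    \left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)

of the Legendre symbol easily imply that quadratic reciprocity is equivalent 
to 

(1.12)
    \left(\frac{p^*}{q}\right) = \left(\frac{q}{p}\right)

(see Exercise 1.7). Since both sides are ±1, it follows that quadratic 
reciprocity can be stated as 

    \left(\frac{q}{p}\right) = 1 \iff \left(\frac{p^*}{q}\right) = 1.

Comparing this to Conjecture 1.9, we see that it suffices to show 

(1.13)
    \left(\frac{p^*}{q}\right) = 1 \iff p \equiv \pm\beta^2 \bmod 4q, \quad \beta \text{ odd}.

The proof of (1.13) is straightforward and is left to the reader (see Exercise 
1.8). Q.E.D. 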


With hindsight, we can see why Euler had trouble with the Reciprocity 
Steps for x² + 2y² and x² + 3y²: he was working out special cases of 
quadratic reciprocity! Exercise 1.9 will discuss which special cases were 
involved. We will not prove quadratic reciprocity in this section, but later in 
§8 we will give a proof using class field theory. Proofs of a more elementary 
nature can be found in most number theory texts. 

The discussion leading up to Conjecture 1.9 is pretty exciting, but was it 
what Euler did? The answer is yes and no. To explain this, we must look 
more closely at Euler’s 1744 paper. In addition to conjectures like (1.8), the 
paper also contained a series of Annotations where Euler speculated on 
what was happening in general. For simplicity, we will concentrate on the 
case of (N/p), where N > 0. Euler notes in Annotation 13 [33, Vol. II, p. 
216] that for such N’s, all of the conjectures have the form 

    \left(\frac{N}{p}\right) = 1 \iff p \equiv \pm\alpha \bmod 4N

for certain odd values of α. Then in Annotation 16 [33, Vol. II, pp. 216- 
217], Euler states that “while 1 is among the values [of the α’s], yet likewise 
any square number, which is prime to 4N, furnishes a suitable value for α.” 
This is close to what we want, but it doesn’t say that the odd squares fill up 
all possible α’s when N is prime. To see this, we turn to Annotation 14 
[33, Vol. II, p. 216], where Euler notes that the number of α’s that occur is 
(1/2)φ(N). When N is prime, this equals (N − 1)/2, exactly the number of 
incongruent squares modulo 4N. Thus what Euler states is fully equivalent 
to Conjecture 1.9. In 1875, Kronecker identified these Annotations as the 
first complete statement of quadratic reciprocity [68, Vol. II, pp. 3-4]. 

The problem is that we have to read between the lines to get quadratic 
reciprocity. Why didn’t Euler state it more explicitly? He knew that the 
prime case was special, for why else would he list the prime cases before 
the composite ones? The answer to this puzzle, as Weil points out [106, pp. 
207-209], is that Euler’s real goal was to characterize the α’s for all N, not 
just primes. To explain this, we need to give a modern description of the 
±α’s. The following lemma is at the heart of the matter: 


Lemma 1.14. If D ≡ 0,1 mod 4 is a nonzero integer, then there is a unique 
homomorphism χ : (Z/DZ)* → {±1} such that χ([p]) = (D/p) for odd 
primes p not dividing D. Furthermore, 

    \chi([-1]) = \begin{cases} 1 & \text{when } D > 0 \\ -1 & \text{when } D < 0. \end{cases}

Proof. The proof will make extensive use of the Jacobi symbol. Given m > 
0 odd and relatively prime to M, recall that the Jacobi symbol (M/m) is 
defined to be the product 

    \left(\frac{M}{m}\right) = \prod_{i=1}^{r} \left(\frac{M}{p_i}\right),

where m = p_1 ⋯ p_r is the prime factorization of m. Note that (M/m) = 
(N/m) when M ≡ N mod m, and there are the multiplicative identities 

(1.15)
    \left(\frac{MN}{m}\right) = \left(\frac{M}{m}\right)\left(\frac{N}{m}\right)
    \left(\frac{M}{mn}\right) = \left(\frac{M}{m}\right)\left(\frac{M}{n}\right)

(see Exercise 1.10). The Jacobi symbol also satisfies the following version 
of quadratic reciprocity: 

(1.16)
    \left(\frac{-1}{m}\right) = (-1)^{(m-1)/2}
    \left(\frac{2}{m}\right) = (-1)^{(m^2-1)/8}
    \left(\frac{M}{m}\right) = (-1)^{(M-1)(m-1)/4} \left(\frac{m}{M}\right)

(see Exercise 1.10). 

For this lemma, the crucial property of the Jacobi symbol is one usually 
not mentioned in elementary texts: if m ≡ n mod D, where m and n are 
odd and positive and D ≡ 0,1 mod 4, then 

(1.17)
    \left(\frac{D}{m}\right) = \left(\frac{D}{n}\right).

The proof is quite easy when D ≡ 1 mod 4 and D > 0: using quadratic 
reciprocity (1.16), the two sides of (1.17) become 

(1.18)
    (-1)^{(D-1)(m-1)/4} \left(\frac{m}{D}\right)
    (-1)^{(D-1)(n-1)/4} \left(\frac{n}{D}\right).

To compare these, first note that the two Jacobi symbols are equal since 
m ≡ n mod D. From D ≡ 1 mod 4 we see that 

    (D - 1)(m - 1)/4 \equiv (D - 1)(n - 1)/4 \equiv 0 \bmod 2

since m and n are odd. Thus the signs in front of (1.18) are both +1, and 
(1.17) follows. When D is even or negative, a similar argument using the 
supplementary laws from (1.16) shows that (1.17) still holds (see Exercise 
1.11). 

It follows from (1.17) that χ([m]) = (D/m) gives a well-defined 
homomorphism from (Z/DZ)* to {±1} (see Exercise 1.12), and the statement 
concerning χ([−1]) follows from the above properties of the Jacobi symbol 
(see Exercise 1.12). Finally, the condition χ([p]) = (D/p) for odd primes p 
determines χ uniquely because every class in (Z/DZ)* contains a prime; 
this is a consequence of Dirichlet’s theorem on primes in arithmetic 
progressions (to be proved in §8). Q.E.D. 


The above proof made heavy use of quadratic reciprocity, which is no 
accident: Lemma 1.14 is in fact equivalent to quadratic reciprocity and the 
supplementary laws (see Exercise 1.13). For us, however, the main feature 
of Lemma 1.14 is that it gives a complete solution of the Reciprocity Step 
of Euler’s strategy: 


Corollary 1.19. Let n be a nonzero integer, and let χ : (Z/4nZ)* → {±1} be 
the homomorphism from Lemma 1.14 when D = −4n. If p is an odd prime 
not dividing n, then the following are equivalent: 


(i) p | x² + ny², gcd(x,y) = 1. 
(ii) (−n/p) = 1. 
(iii) [p] ∈ ker(χ) ⊂ (Z/4nZ)*. 


Proof. (i) and (ii) are equivalent by Lemma 1.7, and since (−4n/p) = 
(−n/p), (ii) and (iii) are equivalent by Lemma 1.14. Q.E.D. 


To see how this solves the Reciprocity Step, note that if ker(χ) = 
{[α],[β],[γ],…}, then [p] ∈ ker(χ) is equivalent to the congruence p ≡ 
α, β, γ, … mod 4n, which is exactly the kind of condition we were looking 
for. Actually, Lemma 1.14 allows us to refine this a bit: when n ≡ 3 mod 4, 
the congruence can be taken to be of the form p ≡ α, β, γ, … mod n (see 
Exercise 1.14). We should also note that in all cases, the usual statement of 
quadratic reciprocity makes it easy to compute the classes in question (see 
Exercise 1.15 for an example). 
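
The kernel in Corollary 1.19 can be computed directly for small n. The sketch below is an illustration under stated assumptions: `jacobi` is the standard binary algorithm for the Jacobi symbol (it is not the method of the text, which works from the definition), and `kernel` is a hypothetical helper that evaluates χ([m]) = (D/m) on the representatives 0 < m < 4|n| prime to 4n, all of which are odd and positive.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0 (standard binary algorithm)."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using the supplementary law for (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity for odd a and n.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def kernel(n):
    """Representatives of ker(chi) in (Z/4nZ)* for D = -4n."""
    D, m = -4 * n, 4 * abs(n)
    return [k for k in range(1, m) if gcd(k, m) == 1 and jacobi(D, k) == 1]

print(kernel(5))   # classes mod 20
print(kernel(14))  # classes mod 56
```

For n = 5 this returns the classes 1, 3, 7, 9 mod 20 that appear in (1.8), and in both cases exactly half of the units lie in the kernel, as expected for a subgroup of index 2.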

To see how this relates to what Euler did in 1744, let N be as above, and 
let D = 4N in Lemma 1.14. Then ker(χ) consists exactly of Euler’s ±α’s 
(when N > 0, the lemma also implies that −1 ∈ ker(χ), which explains the 
± signs). The second thing to note is that when N is odd and squarefree, 
K = ker(χ) is uniquely characterized by the following four properties: 

(i) K is a subgroup of index 2 in (Z/4NZ)*. 
(ii) −1 ∈ K when N > 0 and −1 ∉ K when N < 0. 
(iii) K has period N if N ≡ 1 mod 4 and period 4N otherwise. (Having 
period P > 0 means that if [a],[b] ∈ (Z/4NZ)*, [a] ∈ K and a ≡ b mod 
P, then [b] ∈ K.) 
(iv) K does not have any smaller period. 




For a proof of this characterization, see Weil [106, pp. 287-291]. In the 
Annotations to his 1744 paper, Euler gives very clear statements of (i)-(iii) 
(see Annotations 13-16 in [33, Vol. II, pp. 216-217]), and as for (iv), he 
notes that N is not a period when N ≢ 1 mod 4, but says nothing about the 
possibility of smaller periods (see Annotation 20 in [33, Vol. II, p. 219]). 
So Euler doesn’t quite give a complete characterization of ker(χ), but he 
comes incredibly close. It is a tribute to Euler’s insight that he could deduce 
this underlying structure on the basis of examples like (1.8). 


D. Beyond Quadratic Reciprocity 


We will next discuss some of Euler’s conjectures concerning primes of the 
form x² + ny² for n > 3. We start with the cases n = 5 and 14 (taken from 
his 1744 paper), for each will have something unexpected to offer us. 
When n = 5, Euler conjectured that for odd primes p ≠ 5, 

(1.20)
    p = x^2 + 5y^2 \iff p \equiv 1, 9 \bmod 20
    2p = x^2 + 5y^2 \iff p \equiv 3, 7 \bmod 20.


Recall from (1.8) that p | x² + 5y² is equivalent to p ≡ 1,3,7,9 mod 20. 
Hence these four congruence classes break up into two groups {1,9} and 
{3,7} which have quite different representability properties. This is a new 
phenomenon, not encountered for x² + ny² when n ≤ 3. Note also that the 
classes 3,7 modulo 20 are the ones that entered into Fermat’s speculations 
on x² + 5y², so something interesting is going on here. In §2 we will see 
that this is one of the examples that led Lagrange to discover genus theory. 
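
Euler arrived at (1.20) by working out cases, and the same experiment is easy to repeat. The following Python sketch is a brute-force check over a finite range, so it supports the conjecture without proving it; the helper names and the bound 200 are assumptions made for illustration.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def represented(m, n):
    """True if m = x^2 + n*y^2 has a solution in nonnegative integers x, y."""
    y = 0
    while n * y * y <= m:
        r = m - n * y * y
        if isqrt(r) ** 2 == r:
            return True
        y += 1
    return False

# For each odd prime p != 5, print p mod 20 and whether p and 2p are of the form x^2 + 5y^2.
for p in range(3, 200):
    if is_prime(p) and p != 5:
        print(p, p % 20, represented(p, 5), represented(2 * p, 5))
```

In the printed table, p itself is represented exactly when p mod 20 is 1 or 9, and 2p is represented exactly when p mod 20 is 3 or 7.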

The case n = 14 is yet more complicated. Here, Euler makes the following 
conjecture for odd primes p ≠ 7: 

(1.21)
    p = \begin{cases} x^2 + 14y^2 \\ 2x^2 + 7y^2 \end{cases} \iff p \equiv 1, 9, 15, 23, 25, 39 \bmod 56
    3p = x^2 + 14y^2 \iff p \equiv 3, 5, 13, 19, 27, 45 \bmod 56.


As with (1.20), the union of the two groups of congruence classes in (1.21) 
describes those primes for which (−14/p) = 1. The new puzzle here is that 
we don’t seem to be able to separate x² + 14y² from 2x² + 7y². In §2, we 
will see that this is not an oversight on Euler’s part, for the two quadratic 
forms x² + 14y² and 2x² + 7y² are in the same genus and hence can’t be 
separated by congruence classes. Another puzzle is why (1.20) uses 2p while 
(1.21) uses 3p. In §2 we will use composition to explain these facts. One 
could also ask what extra condition is needed to ensure p = x² + 14y². This 
lies much deeper, for as we will see in §5, it involves the Hilbert class field 
of Q(√−14). 

The final examples we want to discuss come from quite a different 
source, the Tractatus de numerorum doctrina capita sedecim quae supersunt, 
which Euler wrote in the period 1748-1750 [33, Vol. V, pp. 182-283]. Euler 
intended this work to be a basic text for number theory, in the same way 
that his Introductio in analysin infinitorum [33, Vol. VIII-IX] was the first 
real textbook in analysis. Unfortunately, Euler never completed the 
Tractatus, and it was first published only in 1849. Weil [106, pp. 192-196] gives a 
description of what’s in the Tractatus (see also [33, Vol. V, pp. XIX-XXVI]). 
For us, the most interesting chapters are the two that deal with cubic and 
biquadratic residues. Recall that a number a is a cubic (resp. biquadratic) 
residue modulo p if the congruence x³ ≡ a mod p (resp. x⁴ ≡ a mod p) has 
an integer solution. Euler makes the following conjectures about when 2 is 
a cubic or biquadratic residue modulo an odd prime p: 


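(1.22)
    p = x^2 + 27y^2 \iff \begin{cases} p \equiv 1 \bmod 3 \text{ and 2 is a} \\ \text{cubic residue modulo } p \end{cases}

(1.23)
    p = x^2 + 64y^2 \iff \begin{cases} p \equiv 1 \bmod 4 \text{ and 2 is a} \\ \text{biquadratic residue modulo } p \end{cases}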


(see [33, Vol. V, pp. 250 and 258]). In §4, we will see that both of these 
conjectures were proved by Gauss as consequences of his work on cubic 
and biquadratic reciprocity. 

The importance of the examples (1.20)-(1.23) is hard to overestimate. 
Thanks to Euler’s amazing ability to find patterns, we now see some of 
the serious problems to be tackled (in (1.20) and (1.21)), and we have our 
first hint of what the final solution will look like (in (1.22) and (1.23)). 
Much of the next three sections will be devoted to explaining and proving 
these conjectures. In particular, it should be clear that we need to learn 
a lot more about quadratic forms. Euler left us with a magnificent series 
of examples and conjectures, but it remained for Lagrange to develop the 
language which would bring the underlying structure to light. 


E. Exercises 


1.1. In this exercise, we prove some identities used by Euler. 
(a) Prove (1.3) and its generalization (1.6). 
(b) Generalize (1.6) to find an identity of the form 


    (ax^2 + cy^2)(az^2 + cw^2) = (?)^2 + ac(?)^2.

This is due to Euler [33, Vol. I, p. 424]. 

1.2. Let p be prime, and let f(x) be a monic polynomial of degree d < p. 
This exercise will describe Euler’s proof that the congruence f(x) ≢ 
0 mod p has a solution. Let Δf(x) = f(x + 1) − f(x) be the 
difference operator. 

(a) For any k ≥ 1, show that Δ^k f(x) is an integral linear 
combination of f(x), f(x + 1), …, f(x + k). 

(b) Show that Δ^d f(x) = d!. 

(c) Euler’s argument is now easy to state: if f(x) ≢ 0 mod p has no 
solutions, then p | Δ^d f(x) follows from (a). By (b), this is 
impossible. 

1.3. Let n be a positive integer. 

(a) Formulate and prove a version of Lemma 1.4 when a prime q = 
x² + ny² divides a number N = a² + nb². 

(b) Show that your proof of (a) works when n = 3 and q = 4. 

1.4. In this exercise, we will prove the Descent Steps for x² + 2y² and 
x² + 3y². 

(a) If a prime p divides x² + 2y², gcd(x,y) = 1, then adapt the 
argument of Theorem 1.2 to show that p = x² + 2y². Hint: use 
Exercise 1.3. 

(b) Prove that if an odd prime p divides x² + 3y², gcd(x,y) = 1, then 
p = x² + 3y². The argument is more complicated because the 
Descent Step fails for p = 2. Thus, if it fails for some odd prime 
p, you have to produce an odd prime q < p where it also fails. 
Hint: part (b) of Exercise 1.3 will be useful. 

1.5. If p = 3k + 1 is prime, prove that (−3/p) = 1. Hint: 

    4(x^{3k} - 1) = (x^k - 1) \cdot 4(x^{2k} + x^k + 1)
                  = (x^k - 1)((2x^k + 1)^2 + 3).

Note that Exercises 1.4(b) and 1.5 prove Fermat’s theorem for x² + 
3y². 

1.6. Prove Lemma 1.7. 

1.7. Use the properties (1.11) of the Legendre symbol to prove that 
quadratic reciprocity is equivalent to (1.12). 

1.8. Prove (1.13). 

1.9. In this exercise we will see how the Reciprocity Steps for x² + y², 
x² + 2y² and x² + 3y² relate to quadratic reciprocity. 

(a) Use Lemma 1.7 to show that for a prime p > 3, 

    p \mid x^2 + 3y^2, \ \gcd(x,y) = 1 \iff p \equiv 1 \bmod 3

is equivalent to 

    \left(\frac{-3}{p}\right) = \left(\frac{p}{3}\right).

By (1.12), we recognize this as part of quadratic reciprocity. 

(b) Use Lemma 1.7 and the bottom line of (1.11) to show that the 
statements 

    p \mid x^2 + y^2, \ \gcd(x,y) = 1 \iff p \equiv 1 \bmod 4
    p \mid x^2 + 2y^2, \ \gcd(x,y) = 1 \iff p \equiv 1, 3 \bmod 8

are equivalent to the statements 

    \left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}
    \left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}.


1.10. This exercise is concerned with the properties of the Jacobi symbol 
(M/m) defined in the proof of Lemma 1.14. 
(a) Prove that (M/m) = (N/m) when M ≡ N mod m. 
(b) Prove (1.15). 
(c) Prove (1.16) using quadratic reciprocity and the two supplementary 
laws (−1/p) = (−1)^{(p−1)/2} and (2/p) = (−1)^{(p²−1)/8}. Hint: 
if r and s are odd, show that 

    (rs - 1)/2 \equiv (r - 1)/2 + (s - 1)/2 \bmod 2
    (r^2 s^2 - 1)/8 \equiv (r^2 - 1)/8 + (s^2 - 1)/8 \bmod 2.

(d) If M is a quadratic residue modulo m, show that (M/m) = 1. 
Give an example to show that the converse is not true. 


1.11. Use (1.15) and (1.16) to complete the proof of (1.17) begun in the 
text. 


1.12. This exercise is concerned with the map χ : (Z/DZ)* → {±1} of 
Lemma 1.14. When m is odd and positive, we define χ([m]) to be 
the Jacobi symbol (D/m). 
(a) Show that any class in (Z/DZ)* may be written as [m], where 
m is odd and positive, and then use (1.17) to show that χ is a 
well-defined homomorphism on (Z/DZ)*. 
(b) Show that 

    \chi([-1]) = \begin{cases} 1 & \text{if } D > 0 \\ -1 & \text{if } D < 0. \end{cases}

(c) If D ≡ 1 mod 4, show that 

    \chi([2]) = \begin{cases} 1 & \text{if } D \equiv 1 \bmod 8 \\ -1 & \text{if } D \equiv 5 \bmod 8. \end{cases}


1.13. In this exercise, we will assume that Lemma 1.14 holds for all 
nonzero integers D ≡ 0,1 mod 4, and we will prove quadratic reciprocity 
and the supplementary laws. 

(a) Let p and q be distinct odd primes, and let q* = (−1)^{(q−1)/2} q. 
By applying the lemma with D = q*, show that (q*/·) induces 
a homomorphism from (Z/qZ)* to {±1}. Since (·/q) can be 
regarded as a homomorphism between the same two groups and 
(Z/qZ)* is cyclic, conclude that the two are equal. 

(b) Use similar arguments to prove the supplementary laws. Hint: 
apply the lemma with D = −4 and 8 respectively. 


1.14. Use Lemma 1.14 to prove that when n ≡ 3 mod 4, there are integers 
α, β, γ, … such that for an odd prime p not dividing n, p | x² + ny², 
gcd(x,y) = 1 if and only if p ≡ α, β, γ, … mod n. 


1.15. Use quadratic reciprocity to determine those classes in (Z/84Z)* 
with (−21/p) = 1. This tells us when p | x² + 21y², and thus solves 
the Reciprocity Step when n = 21. 


1.16. In the discussion following the proof of Lemma 1.14, we stated that 
K = ker(χ) is characterized by the four properties (i)-(iv). When 
D = 4q, where q is an odd prime, prove that (i) and (ii) suffice to 
determine K uniquely. 


§2. LAGRANGE, LEGENDRE AND QUADRATIC FORMS 


The study of integral quadratic forms in two variables 

    f(x,y) = ax^2 + bxy + cy^2, \quad a, b, c \in \mathbb{Z}

began with Lagrange, who introduced the concepts of discriminant, 
equivalence and reduced form. When these are combined with Gauss’ notion of 
proper equivalence, one has all of the ingredients necessary to develop the 
basic theory of quadratic forms. We will concentrate on the special case of 
positive definite forms. Here, Lagrange’s theory of reduced forms is 
especially nice, and in particular we will get a complete solution of the Descent 
Step from §1. When this is combined with the solution of the Reciprocity 
Step given by quadratic reciprocity, we will get immediate proofs of 
Fermat’s theorems (1.1) as well as several new results. We will then describe 
an elementary form of genus theory due to Lagrange, which will enable us 
to prove some of Euler’s conjectures from §1, and we will also be able to 
solve our basic question of p = x² + ny² for quite a few n. The section will 
end with some historical remarks concerning Lagrange and Legendre. 


A. Quadratic Forms 


Our treatment of quadratic forms is taken primarily from Lagrange’s 
“Recherches d’Arithmétique” of 1773-1775 [69, pp. 695-795] and Gauss’ 
Disquisitiones Arithmeticae of 1801 [41, §§153-226]. Most of the terminology 
is due to Gauss, though many of the terms he introduced refer to concepts 
used implicitly by Lagrange (with some important exceptions). 

A first definition is that a form ax² + bxy + cy² is primitive if its 
coefficients a, b and c are relatively prime. Note that any form is an integer 
multiple of a primitive form. We will deal exclusively with primitive forms. 

An integer m is represented by a form f(x,y) if the equation 

(2.1)
    m = f(x,y)

has an integer solution in x and y. If the x and y in (2.1) are relatively 
prime, we say that m is properly represented by f(x,y). Note that the basic 
question of the book can be restated as: which primes are represented by 
the quadratic form x² + ny²? 

Next, we say that two forms f(x,y) and g(x,y) are equivalent if there 
are integers p, q, r and s such that 

(2.2)
    f(x,y) = g(px + qy, rx + sy) \quad \text{and} \quad ps - qr = \pm 1.

Since \det\begin{pmatrix} p & q \\ r & s \end{pmatrix} = ps − qr = ±1, this 
means that \begin{pmatrix} p & q \\ r & s \end{pmatrix} is in the group of 
2×2 invertible integer matrices GL(2,Z), and it follows easily that the 
equivalence of forms is an equivalence relation (see Exercise 2.2). An 
important observation is that equivalent forms represent the same numbers, 
and the same is true for proper representations (see Exercise 2.2). Note 
also that any form equivalent to a primitive form is itself primitive (see 
Exercise 2.2). Following Gauss, we say that an equivalence is a proper 
equivalence if ps − qr = 1, i.e., \begin{pmatrix} p & q \\ r & s \end{pmatrix} ∈ SL(2,Z), 
and it is an improper equivalence if ps − qr = −1 [41, §158]. Since SL(2,Z) 
is a subgroup of GL(2,Z), it follows that proper equivalence is also an 
equivalence relation (see Exercise 2.2). 

The notion of equivalence is due to Lagrange, though he simply said 
that one form “can be transformed into another of the same kind” [69, p. 
723]. Neither Lagrange nor Legendre made use of proper equivalence. The 
terms “equivalence” and “proper equivalence” are due to Gauss [41, §157], 
and after stating their definitions, Gauss promises that “the usefulness of 
these distinctions will soon be made clear” [41, §158]. In §3 we will see that 
he was true to his word. 

As an example of these concepts, note that the forms ax² + bxy + cy² 
and ax² − bxy + cy² are always improperly equivalent via the substitution 
(x,y) ↦ (x,−y). But are they properly equivalent? This is not obvious. We 
will see below that the answer is sometimes yes (for 2x² ± 2xy + 3y²) and 
sometimes no (for 3x² ± 2xy + 5y²). 

There is a very nice relation between proper representations and proper 
equivalence: 


Lemma 2.3. A form f(x,y) properly represents an integer m if and only 
if f(x,y) is properly equivalent to the form mx² + bxy + cy² for some 
b, c ∈ Z. 


Proof. First, suppose that f(p,q) = m, where p and q are relatively prime. 
We can find integers r and s so that ps − qr = 1, and then 

    f(px + ry, qx + sy) = f(p,q)x^2 + (2apr + bps + brq + 2cqs)xy + f(r,s)y^2
                        = mx^2 + bxy + cy^2

is of the desired form. To prove the converse, note that mx² + bxy + cy² 
represents m properly by taking (x,y) = (1,0), and the lemma is proved. 
Q.E.D. 


We define the discriminant of ax² + bxy + cy² to be D = b² − 4ac. To 
see how this definition relates to equivalence, suppose the forms f(x,y) 
and g(x,y) have discriminants D and D′ respectively, and that 

    f(x,y) = g(px + qy, rx + sy), \quad p, q, r, s \in \mathbb{Z}.

Then a straightforward calculation shows that 

    D = (ps - qr)^2 D'

(see Exercise 2.3), so that the two forms have the same discriminant 
whenever ps − qr = ±1. Thus equivalent forms have the same discriminant. 
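
The substitution in (2.2) and the discriminant identity are easy to experiment with. The following Python sketch is an illustration only: forms are stored as coefficient triples (a, b, c), and the helper names `substitute` and `disc` are assumptions, not notation from the text. The example reproduces the construction in the proof of Lemma 2.3 for the proper representation 29 = 3² + 5·2² by x² + 5y².

```python
def substitute(g, p, q, r, s):
    """Coefficients of f(x, y) = g(px + qy, rx + sy) for g = (a, b, c)."""
    a, b, c = g
    return (a * p * p + b * p * r + c * r * r,          # f(1, 0) = g(p, r)
            2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s,
            a * q * q + b * q * s + c * s * s)          # f(0, 1) = g(q, s)

def disc(f):
    """Discriminant b^2 - 4ac of the form (a, b, c)."""
    a, b, c = f
    return b * b - 4 * a * c

g = (1, 0, 5)                    # x^2 + 5y^2, with g(3, 2) = 29
f = substitute(g, 3, 1, 2, 1)    # x -> 3x + y, y -> 2x + y, determinant 3*1 - 1*2 = 1
print(f, disc(f), disc(g))

h = substitute(g, 2, 0, 0, 1)    # determinant 2, so the discriminant is multiplied by 4
print(h, disc(h))
```

The first substitution has determinant 1 and produces a form with leading coefficient 29 and the same discriminant −20; the second has determinant 2 and multiplies the discriminant by 4.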

The sign of the discriminant D has a strong effect on the behavior of the 
form. If f(x,y) = ax² + bxy + cy², then we have the identity 

(2.4)
    4a f(x,y) = (2ax + by)^2 - Dy^2.


If D>0, then f(x,y) represents both positive and negative integers, and 
we call the form indefinite, while if D <0, then the form represents only 
positive integers or only negative ones, depending on the sign of a, and 
f(x,y) is accordingly called positive definite or negative definite (see Exer- 
cise 2.4). Note that all of these notions are invariant under equivalence. 

The discriminant D influences the form in one other way: since D = 
b² − 4ac, we have D ≡ b² mod 4, and it follows that the middle coefficient 
b is even (resp. odd) if and only if D ≡ 0 (resp. 1) mod 4. 

We have the following necessary and sufficient condition for a number 
m to be represented by a form of discriminant D: 


Lemma 2.5. Let D ≡ 0,1 mod 4 be an integer and m be an odd integer 
relatively prime to D. Then m is properly represented by a primitive form of 
discriminant D if and only if D is a quadratic residue modulo m. 


Proof. If f(x,y) properly represents m, then by Lemma 2.3, we may 
assume f(x,y) = mx² + bxy + cy². Thus D = b² − 4mc, and D ≡ b² mod m 
follows immediately. 

Conversely, suppose that D ≡ b² mod m. Since m is odd, we can 
assume that D and b have the same parity (replace b by b + m if 
necessary), and then D ≡ 0,1 mod 4 implies that D ≡ b² mod 4m. This means 
that D = b² − 4mc for some c. Then mx² + bxy + cy² represents m 
properly and has discriminant D, and the coefficients are relatively prime since 
m is relatively prime to D. Q.E.D. 


For our purposes, the most useful version of Lemma 2.5 will be the fol- 
lowing corollary: 


Corollary 2.6. Let n be an integer and let p be an odd prime not dividing 
n. Then (−n/p) = 1 if and only if p is represented by a primitive form of 
discriminant −4n. 


Proof. This follows immediately from Lemma 2.5 because −4n is a 
quadratic residue modulo p if and only if (−4n/p) = (−n/p) = 1. Q.E.D. 


This corollary is relevant to the question raised in §1 when we tried to 
generalize the Descent Step of Euler’s strategy. Recall that we asked how 
to represent prime divisors of x² + ny², gcd(x,y) = 1. Note that Corollary 
2.6 gives a first answer to this question, for such primes satisfy (−n/p) = 
1, and hence are represented by forms of discriminant −4n. The problem 
is that there are too many quadratic forms of a given discriminant. For 
example, if the proof of Lemma 2.5 is applied to (−3/13) = 1, then we see 
that 13 is represented by the form 13x² + 12xy + 3y² of discriminant −12. 
This is not very enlightening. So to improve Corollary 2.6, we need to show 
that every form is equivalent to an especially simple one. Lagrange’s theory 
of reduced forms does this and a lot more. 
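
The construction in the proof of Lemma 2.5 can be carried out mechanically. The sketch below is a hedged illustration: `form_representing` is a hypothetical helper, and instead of adjusting a square root b by adding m, it simply searches 0 ≤ b < 2m for the first b with the parity of D and b² ≡ D mod m, which yields the same kind of form. It assumes D ≡ 0,1 mod 4 and m odd, as in the lemma.

```python
def form_representing(D, m):
    """Return (m, b, c) with b^2 - 4mc = D, or None if D is not a square mod m.

    Assumes D = 0 or 1 mod 4 and m odd and positive (hypotheses of Lemma 2.5).
    """
    for b in range(2 * m):
        # b must have the same parity as D and satisfy b^2 = D mod m;
        # together with D = 0, 1 mod 4 this gives b^2 = D mod 4m.
        if (b - D) % 2 == 0 and (b * b - D) % m == 0:
            return (m, b, (b * b - D) // (4 * m))
    return None

print(form_representing(-12, 13))   # the example (-3/13) = 1 from the text
print(form_representing(-20, 11))   # -20 is not a square modulo 11
```

For D = −12 and m = 13 the search stops at b = 12 and returns the form 13x² + 12xy + 3y² mentioned above. The resulting form is primitive only when m is relatively prime to D, which the sketch does not check.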

So far, we’ve dealt with arbitrary quadratic forms, but from this point 
on, we will specialize to the positive definite case. These forms include the 
ones we’re most interested in (namely, x² + ny² for n > 0), and their theory 
has a classical simplicity and elegance. In particular, there is an especially 
nice notion of reduced form. 

A primitive positive definite form ax² + bxy + cy² is said to be reduced 
if 

(2.7)
    |b| \le a \le c, \quad \text{and } b \ge 0 \text{ if either } |b| = a \text{ or } a = c.


(Note that a and c are positive since the form is positive definite.) The 
basic theorem is the following: 


Theorem 2.8. Every primitive positive definite form is properly equivalent to 
a unique reduced form. 


Proof. The first step is to show that a given form is properly equivalent 
to one satisfying |b| ≤ a ≤ c. Among all forms properly equivalent to the 
given one, pick f(x,y) = ax² + bxy + cy² so that |b| is as small as possible. 
If a < |b|, then 

    g(x,y) = f(x + my, y) = ax^2 + (2am + b)xy + c'y^2, \quad c' = am^2 + bm + c,

is properly equivalent to f(x,y). Since a < |b|, we can choose m ∈ Z so that 
|2am + b| < |b|, which contradicts our choice of f(x,y). Thus a ≥ |b|, and 
c ≥ |b| follows similarly. If a > c, we need to interchange the outer 
coefficients, which is accomplished by the proper equivalence (x,y) ↦ (−y,x). 
The resulting form satisfies |b| ≤ a ≤ c. 

The next step is to show that such a form is properly equivalent to a 
reduced one. By definition (2.7), the form is already reduced unless b < 0 and 
a = −b or a = c. In these exceptional cases, ax² − bxy + cy² is reduced, so 
that we need only show that the two forms ax² ± bxy + cy² are properly 
equivalent. This is done as follows: 

    a = −b : (x,y) ↦ (x + y, y) takes ax² − axy + cy² to ax² + axy + cy². 

    a = c  : (x,y) ↦ (−y, x) takes ax² + bxy + ay² to ax² − bxy + ay². 


The final step in the proof is to show that different reduced forms 
cannot be properly equivalent. This is the uniqueness part of the theorem. If 
f(x,y) = ax² + bxy + cy² satisfies |b| ≤ a ≤ c, then one easily shows that 

(2.9)
    f(x,y) \ge (a - |b| + c) \min(x^2, y^2)


(see Exercise 2.7). Thus f(x,y) ≥ a − |b| + c whenever xy ≠ 0, and it 
follows that a is the smallest nonzero value of f(x,y). Furthermore, if c > a, 
then c is the next smallest number represented properly by f(x,y), so that 
in this case the outer coefficients of a reduced form give the minimum 
values properly represented by any equivalent form. These observations are 
due to Legendre [74, Vol. I, pp. 77-78]. 

We can now prove uniqueness. For simplicity, assume that f(x,y) = 
ax² + bxy + cy² is a reduced form that satisfies the strict inequalities |b| < 
a < c. The above considerations imply that 

(2.10)
    a < c < a - |b| + c

are the three smallest numbers properly represented by f(x,y). Using these 
inequalities and (2.9), it follows that 

(2.11)
    f(x,y) = a, \ \gcd(x,y) = 1 \iff (x,y) = \pm(1,0)
    f(x,y) = c, \ \gcd(x,y) = 1 \iff (x,y) = \pm(0,1)

(see Exercise 2.8). Now let g(x,y) be a reduced form equivalent to f(x,y). 
Since these forms represent the same numbers and are reduced, they must 
have the same first coefficient a by Legendre’s observation. Now consider 
the third coefficient c′ of g(x,y). We know that a ≤ c′ since g(x,y) is 
reduced. If equality occurred, then the equation g(x,y) = a would have four 
proper solutions ±(1,0) and ±(0,1). Since f(x,y) is equivalent to g(x,y), 
this would contradict (2.11). Thus a < c′, and then Legendre’s 
observation shows that c = c′. Hence the outer coefficients of f(x,y) and g(x,y) 
are the same, and since they have the same discriminant, it follows that 
g(x,y) = ax² ± bxy + cy². 

It remains to show that f(x,y) = g(x,y) when we make the stronger 
assumption that the forms are properly equivalent. If we assume that 

    g(x,y) = f(px + qy, rx + sy), \quad ps - qr = 1,

then a = g(1,0) = f(p,r) and c = g(0,1) = f(q,s) are proper 
representations. By (2.11), it follows that (p,r) = ±(1,0) and (q,s) = ±(0,1). Then 
ps − qr = 1 implies \begin{pmatrix} p & q \\ r & s \end{pmatrix} = \pm\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, 
and f(x,y) = g(x,y) follows easily. 

When a = |b| or a = c, the above argument breaks down, because the 
values in (2.10) are no longer distinct. Nevertheless, one can still show that 
f(x,y) and g(x,y) reduce to ax² ± bxy + cy², and then the restriction b ≥ 
0 in definition (2.7) implies equality. (See Exercise 2.8, or for the complete 
details, Scharlau and Opolka [86, pp. 36-38].) Q.E.D. 


Note that we can now answer our earlier question about equivalence 
versus proper equivalence. Namely, the forms 3x² ± 2xy + 5y² are clearly 
equivalent, but since they are both reduced, Theorem 2.8 implies that they 
are not properly equivalent. On the other hand, of 2x² ± 2xy + 3y², only 
2x² + 2xy + 3y² is reduced (because a = |b|), and by the proof of Theorem 
2.8, it is properly equivalent to 2x² − 2xy + 3y². 
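
The existence half of Theorem 2.8 can be turned into a procedure. The proof above chooses |b| minimal, which is not constructive, so the following Python sketch swaps in the standard reduction loop built from the same two proper substitutions, (x,y) ↦ (x + my, y) and (x,y) ↦ (−y, x). The function name `reduce_form` and the triple representation (a, b, c) are assumptions made for illustration, and the input is assumed to be positive definite.

```python
def reduce_form(a, b, c):
    """Reduced form properly equivalent to the positive definite form (a, b, c)."""
    while True:
        if not (-a < b <= a):
            # (x, y) -> (x + my, y): b becomes b + 2am, c becomes f(m, 1).
            # This m is the unique integer with -a < b + 2am <= a.
            m = (a - b) // (2 * a)
            b, c = b + 2 * a * m, a * m * m + b * m + c
        elif a > c:
            # (x, y) -> (-y, x) interchanges the outer coefficients.
            a, b, c = c, -b, a
        elif a == c and b < 0:
            # Same substitution when a = c: (a, b, a) becomes (a, -b, a).
            b = -b
        else:
            return (a, b, c)

print(reduce_form(2, -2, 3))    # the case a = |b| discussed above
print(reduce_form(3, -2, 5))    # already reduced, so unchanged
print(reduce_form(13, 12, 3))   # the form of discriminant -12 from Lemma 2.5
```

The loop terminates because every interchange strictly decreases the positive integer a. On the three inputs it returns 2x² + 2xy + 3y², 3x² − 2xy + 5y² and x² + 3y² respectively.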

In order to complete the elementary theory of reduced forms, we need 
one more observation. Suppose that ax² + bxy + cy² is a reduced form of 
discriminant D < 0. Then b² ≤ a² and a ≤ c, so that 

    -D = 4ac - b^2 \ge 4a^2 - a^2 = 3a^2

and thus 

(2.12)
    a \le \sqrt{(-D)/3}.

If D is fixed, then |b| ≤ a and (2.12) imply that there are only finitely many 
choices for a and b. Since b² − 4ac = D, the same is true for c, so that 
there are only a finite number of reduced forms of discriminant D. Then 
Theorem 2.8 implies that the number of proper equivalence classes is also 
finite. Following Gauss [41, §223], we say that two forms are in the same 
class if they are properly equivalent. We will let h(D) denote the number 
of classes of primitive positive definite forms of discriminant D, which by 
Theorem 2.8 is just the number of reduced forms. We have thus proved the 
following theorem: 


Theorem 2.13. Let D < 0 be fixed. Then the number h(D) of classes of prim- 
itive positive definite forms of discriminant D is finite, and furthermore h(D) 
is equal to the number of reduced forms of discriminant D. Q.E.D. 


The above discussion also shows that there is an algorithm for computing 
reduced forms and class numbers which, for small discriminants, is easily 
implemented on a computer (see Exercise 2.9). Here are some examples 
which will prove useful later on: 


             D       Reduced Forms of Discriminant D 
            −4       x² + y² 
            −8       x² + 2y² 
           −12       x² + 3y² 
(2.14)     −20       x² + 5y², 2x² + 2xy + 3y² 
           −28       x² + 7y² 
           −56       x² + 14y², 2x² + 7y², 3x² ± 2xy + 5y² 
          −108       x² + 27y², 4x² ± 2xy + 7y² 
          −256       x² + 64y², 4x² + 4xy + 17y², 5x² ± 2xy + 13y² 


Note, by the way, that x² + ny² is always a reduced form! For a further 
discussion of the computational aspects of class numbers, see Buell [12] and 
Shanks [89] (the algorithm described in [89] makes nice use of the theory 
to be described in §3). 
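
The finiteness argument leading to Theorem 2.13 translates directly into a search. The following Python sketch is one possible implementation, not the one intended by Exercise 2.9 or by the references above: the function name `reduced_forms` is an assumption, and the search simply runs over 1 ≤ a ≤ √(−D/3) from (2.12) and |b| ≤ a, keeping the triples that satisfy definition (2.7) and are primitive.

```python
from math import gcd

def reduced_forms(D):
    """List the reduced primitive positive definite forms (a, b, c) of discriminant D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -D:                      # the bound (2.12)
        for b in range(-a, a + 1):              # |b| <= a
            if (b * b - D) % (4 * a) != 0:      # need c = (b^2 - D)/(4a) in Z
                continue
            c = (b * b - D) // (4 * a)
            if c < a:                           # need a <= c
                continue
            if b < 0 and (-b == a or a == c):   # b >= 0 when |b| = a or a = c
                continue
            if gcd(gcd(a, abs(b)), c) != 1:     # primitive forms only
                continue
            forms.append((a, b, c))
        a += 1
    return forms

for D in (-4, -8, -12, -20, -28, -56, -108, -256):
    forms = reduced_forms(D)
    print(D, len(forms), forms)
```

The output lists h(D) together with the reduced forms, and for these eight discriminants it agrees with table (2.14), where each entry written with ± counts as two forms.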

This completes our discussion of positive definite forms. We should also 
mention that there is a corresponding theory for indefinite forms. Its roots 
reach back to Fermat and Euler (both considered special cases, such as 
x² − 2y²), and Lagrange and Gauss each developed a general theory of 
such forms. There are notions of reduced form, class number, etc., but 
the uniqueness problem is much more complicated. As Gauss notes, “it 
can happen that many reduced forms are properly equivalent among 
themselves” [41, §184]. Determining exactly which reduced forms are 
properly equivalent is not easy (see Lagrange [69, pp. 728-740] and Gauss [41, 
§§183-193]). There are also connections with continued fractions and Pell’s 
equation (see [41, §§183-205]), so that the indefinite case has a very 
different flavor. Two modern references are Flath [36, Chapter IV] and Zagier 
[111, §§8, 13 and 14]. 


B. p = x^2 + ny^2 and Quadratic Forms


We can now apply the theory of positive definite quadratic forms to solve 
some of the problems encountered in §1. We start by giving a complete 
solution of the Descent Step of Euler’s strategy: 


Proposition 2.15. Let n be a positive integer and p be an odd prime not 
dividing n. Then (-n/p) = 1 if and only if p is represented by one of the
h(-4n) reduced forms of discriminant -4n.


Proof. This follows immediately from Corollary 2.6 and Theorem 2.8. 
Q.E.D. 


In §1 we showed how quadratic reciprocity gives a general solution of the 
Reciprocity Step of Euler’s strategy. Having just solved the Descent Step, it 
makes sense to put the two together and see what we get. But rather than 
just treat the case of forms of discriminant -4n, we will state a result that
applies to all negative discriminants D < 0. Recall from Lemma 1.14 that
there is a homomorphism χ : (Z/DZ)* → {±1} such that χ([p]) = (D/p)
for odd primes not dividing D. Note that ker(χ) ⊂ (Z/DZ)* is a subgroup
of index 2. We then have the following general theorem:


Theorem 2.16. Let D ≡ 0,1 mod 4 be negative, and let χ : (Z/DZ)* →
{±1} be the homomorphism from Lemma 1.14. Then, for an odd prime p
not dividing D, [p] ∈ ker(χ) if and only if p is represented by one of the h(D)
reduced forms of discriminant D.


Proof. The definition of χ tells us that [p] ∈ ker(χ) if and only if (D/p) =
1. By Lemma 2.5, this last condition is equivalent to being represented by
a primitive positive definite form of discriminant D, and then we are done 
by Theorem 2.8. Q.E.D. 


The basic content of this theorem is that there is a congruence p ≡
α, β, γ, ... mod D which gives necessary and sufficient conditions for an odd
prime p to be represented by a reduced form of discriminant D. This result
is very computational, for we know how to find the reduced forms, and
quadratic reciprocity makes it easy to find the congruence classes α, β, γ, ...
mod D such that (D/p) = 1.

For an example of how Theorem 2.16 works, note that x^2 + y^2, x^2 + 2y^2
and x^2 + 3y^2 are the only reduced forms of discriminants -4, -8 and -12
respectively (this is from (2.14)). Using quadratic reciprocity to find the
congruence classes for which (-1/p), (-2/p) and (-3/p) equal 1, we get
immediate proofs of Fermat’s three theorems (1.1) (see Exercise 2.11). This
shows just how powerful a theory we have: Fermat’s theorems are now
reduced to the status of an exercise. We can also go beyond Fermat, for
notice that by (2.14), x^2 + 7y^2 is the only reduced form of discriminant
-28, and it follows easily that


(2.17)        p = x^2 + 7y^2 ⟺ p ≡ 1, 9, 11, 15, 23, 25 mod 28


for primes p ≠ 7 (see Exercise 2.11). Thus we have made significant progress
in answering our basic question of when p = x^2 + ny^2.
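
As a numerical sanity check of (2.17), and not a substitute for the proof in Exercise 2.11, one can compare both sides for all primes below a bound. The helper names below (is_prime, is_x2_plus_ny2, counterexamples_2_17) and the bound 5000 are illustrative choices of ours.

```python
from math import isqrt

def is_prime(m):
    """Trial division; adequate for the small range used here."""
    return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))

def is_x2_plus_ny2(m, n):
    """True if m = x^2 + n*y^2 has a solution in integers x, y."""
    y = 0
    while n * y * y <= m:
        r = m - n * y * y
        if isqrt(r) ** 2 == r:      # the remainder must be a perfect square
            return True
        y += 1
    return False

def counterexamples_2_17(limit):
    """Primes p != 7 below limit for which the two sides of (2.17) disagree."""
    classes = {1, 9, 11, 15, 23, 25}
    return [p for p in range(2, limit)
            if is_prime(p) and p != 7
            and is_x2_plus_ny2(p, 7) != (p % 28 in classes)]

print(counterexamples_2_17(5000))
```

If (2.17) holds, as the text asserts, the printed list is empty.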

Unfortunately, this method for characterizing p = x^2 + ny^2 works only
when h(-4n) = 1. In 1903, Landau proved a conjecture of Gauss that there
are very few n’s with this property:


Theorem 2.18. Let n be a positive integer. Then 


h(-4n) = 1 ⟺ n = 1, 2, 3, 4 or 7.


Proof. We will follow Landau’s proof [70]. The basic idea is very simple:
x^2 + ny^2 is a reduced form, and for n ∉ {1,2,3,4,7}, we will produce a
second reduced form of the same discriminant, showing that h(-4n) > 1.
We may assume n > 1.

First suppose that n is not a prime power. Then n can be written n = ac,
where 1 < a < c and gcd(a,c) = 1 (see Exercise 2.12), and the form


ax^2 + cy^2


is reduced of discriminant -4ac = -4n. Thus h(-4n) > 1 when n is not a
prime power.
Next suppose that n = 2^r. If r ≥ 4, then


4x^2 + 4xy + (2^(r-2) + 1)y^2


has relatively prime coefficients and is reduced since 4 ≤ 2^(r-2) + 1. Furthermore,
it has discriminant 4^2 - 4·4(2^(r-2) + 1) = -16·2^(r-2) = -4n. Thus
h(-4n) > 1 when n = 2^r, r ≥ 4. One computes directly that h(-4·8) = 2
(see Exercise 2.12), which leaves us with the known cases n = 2 and 4.
Finally, assume that n = p^r, where p is an odd prime. If n + 1 can be
written n + 1 = ac, where 2 ≤ a < c and gcd(a,c) = 1, then


ax^2 + 2xy + cy^2


is reduced of discriminant 2^2 - 4ac = 4 - 4(n + 1) = -4n. Thus h(-4n) > 1
when n + 1 is not a prime power. But n = p^r is odd, so that n + 1 is even,
and hence it remains to consider the case n + 1 = 2^s. If s ≥ 6, then


8x^2 + 6xy + (2^(s-3) + 1)y^2


has relatively prime coefficients and is reduced since 8 ≤ 2^(s-3) + 1. Furthermore,
it has discriminant 6^2 - 4·8(2^(s-3) + 1) = 4 - 4·2^s = 4 - 4(n + 1) =
-4n, and hence h(-4n) > 1 when s ≥ 6. The cases s = 1, 2, 3, 4 and 5
correspond to n = 1, 3, 7, 15 and 31 respectively. Now n = 15 is not a
prime power, and one computes that h(-4·31) = 3 (see Exercise 2.12).
This leaves us with the three known cases n = 1, 3 and 7, and completes
the proof of the theorem. Q.E.D.
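
The case analysis in this proof is explicit enough to be turned into a short program. The following Python sketch mirrors the three cases; the function names (coprime_split, second_reduced_form, is_reduced_primitive) are our own, and the two forms returned for n = 8 and n = 31 are choices we computed by hand, since the proof only records that h(-32) = 2 and h(-124) = 3.

```python
from math import gcd

def is_reduced_primitive(a, b, c):
    """Check |b| <= a <= c, b >= 0 if |b| = a or a = c, and gcd(a, b, c) = 1."""
    if gcd(gcd(a, b), c) != 1 or not (abs(b) <= a <= c):
        return False
    return not ((abs(b) == a or a == c) and b < 0)

def coprime_split(m):
    """Return (a, c) with m = a*c, 1 < a < c, gcd(a, c) = 1, or None
    when m is 1 or a prime power."""
    p = 2
    while p * p <= m:
        if m % p == 0:
            a, rest = 1, m
            while rest % p == 0:        # strip the full power of p
                rest //= p
                a *= p
            if rest == 1:
                return None             # m is a power of p
            return (min(a, rest), max(a, rest))
        p += 1
    return None                         # m is 1 or a prime

def second_reduced_form(n):
    """A reduced form of discriminant -4n other than (1, 0, n),
    or None when n is 1, 2, 3, 4 or 7."""
    split = coprime_split(n)
    if split:                           # n is not a prime power
        return (split[0], 0, split[1])
    if n % 2 == 0:                      # n = 2^r
        if n >= 16:
            return (4, 4, n // 4 + 1)
        return (3, 2, 3) if n == 8 else None
    if n == 1:
        return None
    split = coprime_split(n + 1)        # n = p^r with p odd
    if split:
        return (split[0], 2, split[1])
    if n + 1 >= 64:                     # n + 1 = 2^s with s >= 6
        return (8, 6, (n + 1) // 8 + 1)
    return (5, 4, 7) if n == 31 else None

for n in (5, 8, 16, 31, 127):
    print(n, second_reduced_form(n))
```

For every n outside {1, 2, 3, 4, 7} the returned triple can be checked directly to be reduced, primitive and of discriminant -4n, which is exactly what the proof requires.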


Note that we’ve already discussed the cases n = 1, 2, 3 and 7, and the 
case n = 4 was omitted since p = x^2 + 4y^2 is a trivial corollary of p = x^2 +
y^2 (p is odd, so that one of x or y must be even). One could also ask if
there is a similar finite list of odd discriminants D < 0 with h(D) = 1. The 
answer is yes, but the proof is much more difficult. We will discuss this 
problem in §7 and give a proof in §12. 


C. Elementary Genus Theory 


One consequence of Theorem 2.18 is that we need some new ideas to characterize
p = x^2 + ny^2 when h(-4n) > 1. To get a sense of what’s involved,
consider the example n = 5. Here, Theorem 2.16, quadratic reciprocity and 
(2.14) tell us that 


              p ≡ 1, 3, 7, 9 mod 20 ⟺ (-5/p) = 1
(2.19)
                                    ⟺ p = x^2 + 5y^2 or 2x^2 + 2xy + 3y^2.


We need a method of separating reduced forms of the same discriminant, 
and this is where genus theory comes in. The basic idea is due to Lagrange, 
who, like us, used quadratic forms to prove conjectures of Fermat and Eu- 
ler. But rather than working with reduced forms collectively, as we did in 
Theorem 2.16, Lagrange considers the congruence classes represented in 
(Z/DZ)* by a single form, and he groups together forms that represent the 
same classes. This turns out to be the basic idea of genus theory!

Let’s work out some examples to see how this grouping works. When
D = —20, one easily computes that 


              x^2 + 5y^2 represents 1, 9 in (Z/20Z)*
(2.20)
              2x^2 + 2xy + 3y^2 represents 3, 7 in (Z/20Z)*


while for D = —56 one has 


              x^2 + 14y^2, 2x^2 + 7y^2 represent 1, 9, 15, 23, 25, 39 in (Z/56Z)*
(2.21)
              3x^2 ± 2xy + 5y^2 represent 3, 5, 13, 19, 27, 45 in (Z/56Z)*


(see Exercise 2.14—the reduced forms are taken from (2.14)). In his mem- 
oir on quadratic forms, Lagrange gives a systematic procedure for deter- 
mining the congruence classes in (Z/DZ)* represented by a form of dis- 
criminant D [69, pp. 759-765], and he includes a table listing various re- 
duced forms together with the corresponding congruence classes [69, pp. 
766-767]. The examples in Lagrange’s table show that this is a very natural 
way to group forms of the same discriminant. 

In general, we say that two primitive positive definite forms of discrimi- 
nant D are in the same genus if they represent the same values in (Z/DZ)*. 
Note that equivalent forms represent the same numbers and hence are in 
the same genus. In particular, each genus consists of a finite number of 
classes of forms. The above examples show that when D = —20, there are 
two genera, each consisting of a single class, and when D = —56, there are 
again two genera, but this time each genus consists of two classes. 
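
The grouping into genera is easy to reproduce by machine. In the sketch below, which is only an illustration, the function names values_in_units and genera are ours, forms are encoded as tuples (a, b, c), and the reduced forms are copied from (2.14). Since the class of f(x,y) modulo D depends only on x and y modulo D, it suffices to let x and y run over residues.

```python
from math import gcd

def values_in_units(a, b, c):
    """Classes in (Z/DZ)* represented by ax^2 + bxy + cy^2, D = b^2 - 4ac."""
    m = abs(b * b - 4 * a * c)
    vals = set()
    for x in range(m):
        for y in range(m):
            v = (a * x * x + b * x * y + c * y * y) % m
            if gcd(v, m) == 1:          # keep only classes prime to D
                vals.add(v)
    return frozenset(vals)

def genera(forms):
    """Group forms of one discriminant by the classes they represent."""
    groups = {}
    for f in forms:
        groups.setdefault(values_in_units(*f), []).append(f)
    return groups

for vals, fs in genera([(1, 0, 14), (2, 0, 7), (3, 2, 5), (3, -2, 5)]).items():
    print(sorted(vals), fs)
```

For D = -56 this produces two groups of two forms each, in agreement with the description of the genera given above.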

The real impact of this theory becomes clear when we combine it with 
Theorem 2.16. The basic idea is that genus theory refines our earlier cor- 
respondence between congruence classes and representations by reduced 
forms. For example, when D = -20, (2.19) tells us that p ≡ 1,3,7,9 mod
20 ⟺ p = x^2 + 5y^2 or 2x^2 + 2xy + 3y^2. If we combine this with (2.20), we
obtain


              p = x^2 + 5y^2 ⟺ p ≡ 1, 9 mod 20
(2.22)
              p = 2x^2 + 2xy + 3y^2 ⟺ p ≡ 3, 7 mod 20.


Notice that the top line of (2.22) solves Euler’s conjecture (1.20) for when
p = x^2 + 5y^2! The thing that makes this work is that the two genera represent
disjoint values in (Z/20Z)*. Looking at (2.21), we see that the same
thing happens when D = —56, and then using Theorem 2.16 it is straight- 
forward to prove that 


              p = x^2 + 14y^2 or 2x^2 + 7y^2 ⟺ p ≡ 1, 9, 15, 23, 25, 39 mod 56
(2.23)
              p = 3x^2 + 2xy + 5y^2 ⟺ p ≡ 3, 5, 13, 19, 27, 45 mod 56

(see Exercise 2.15). Note that the top line proves part of Euler’s conjecture
(1.21) concerning x^2 + 14y^2.

In order to combine Theorem 2.16 and genus theory into a general the- 
orem, we must show that the above examples reflect the general case. We 
first introduce some terminology. Given a negative integer D ≡ 0,1 mod 4,
the principal form is defined by 


              x^2 - (D/4) y^2,                 D ≡ 0 mod 4


              x^2 + xy + ((1 - D)/4) y^2,      D ≡ 1 mod 4.


It is easy to check that the principal form has discriminant D and is reduced 
(see Exercise 2.16). Note that when D = -4n, we get our friend x^2 + ny^2.
Using the principal form, we can characterize the congruence classes in 
(Z/DZ)* represented by a form of discriminant D: 


Lemma 2.24. Given a negative integer D ≡ 0,1 mod 4, let ker(χ) ⊂ (Z/DZ)*
be as in Theorem 2.16, and let f(x,y) be a form of discriminant D.

(i) The values in (Z/DZ)* represented by the principal form of discriminant
D form a subgroup H ⊂ ker(χ).


(ii) The values in (Z/DZ)* represented by f(x,y) form a coset of H in
ker(χ).


Proof. We first show that if a number m is prime to D and is represented by
a form of discriminant D, then [m] ∈ ker(χ). By Exercise 2.1, we can write
m = d^2 m', where m' is properly represented by f(x,y). Then χ([m]) =
χ([d^2 m']) = χ([d])^2 χ([m']) = χ([m']). Thus we may assume that m is properly
represented by f(x,y), and then Lemma 2.5 implies that D is a quadratic
residue modulo m, i.e., D = b^2 - km for some b and k. When m is
odd, the properties of the Jacobi symbol (see Lemma 1.14) imply that


χ([m]) = (D/m) = ((b^2 - km)/m) = (b^2/m) = (b/m)^2 = 1,

and our claim is proved. The case when m is even is covered in Exercise
2.17.
We now turn to statements (i) and (ii) of the lemma. Concerning (i), the
above paragraph shows that H ⊂ ker(χ). When D = -4n, the identity (1.6)
shows that H is closed under multiplication, and hence H is a subgroup.
When D ≡ 1 mod 4, the argument is slightly different: here, notice that


4(x^2 + xy + ((1 - D)/4) y^2) ≡ (2x + y)^2 mod D,

which makes it easy to show that H is in fact the subgroup of squares in
(Z/DZ)* (see Exercise 2.17).
To prove (ii), we need the following observation of Gauss [41, §228]: 


Lemma 2.25. Given a form f(x,y) and an integer M, then f(x,y) properly 
represents numbers relatively prime to M. 


Proof. See Exercise 2.18. Q.E.D. 


Now suppose that D = -4n. If we apply Lemma 2.25 with M = 4n and
then use Lemma 2.3, we may assume that f(x,y) = ax^2 + bxy + cy^2, where
a is prime to 4n. Since f(x,y) has discriminant -4n, b is even and can be
written as 2b', and then (2.4) implies that


a f(x,y) = (ax + b'y)^2 + ny^2.


Since a is relatively prime to 4n, it follows that the values of f(x,y) in
(Z/4nZ)* lie in the coset [a]^(-1) H. Conversely, if [c] ∈ [a]^(-1) H, then ac ≡
z^2 + nw^2 mod 4n for some z and w. Using the above identity, it is easy to
solve the congruence f(x,y) ≡ c mod 4n, and thus the coset [a]^(-1) H consists
exactly of the values represented in (Z/DZ)* by f(x,y). The case D ≡
1 mod 4 is similar (see Exercise 2.17), and Lemma 2.24 is proved. Q.E.D.


Since distinct cosets of H are disjoint, Lemma 2.24 implies that different 
genera represent disjoint values in (Z/DZ)*. This allows us to describe genera
by cosets H' of H in ker(χ). We define the genus of H' to consist of all
forms of discriminant D which represent the values of H' modulo D. Then 
Lemma 2.24 immediately implies the following refinement of Theorem 2.16: 


Theorem 2.26. Let D ≡ 0,1 mod 4 be negative, and let H ⊂ ker(χ) be as in
Lemma 2.24. If H' is a coset of H in ker(χ) and p is an odd prime not
dividing D, then [p] ∈ H' if and only if p is represented by a reduced form of
discriminant D in the genus of H'. Q.E.D.


This theorem is the main result of our elementary genus theory. It general- 
izes examples (2.22) and (2.23), and it shows that there are always congru- 
ence conditions which characterize when a prime is represented by some 
form in a given genus. 
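
To see Theorem 2.26 at work numerically, one can test for D = -20 whether each reduced form represents exactly the primes lying in its coset. The sketch below is an illustration only; the helper names and the bound 3000 are our own choices. The representation test uses the identity 4a·f(x,y) = (2ax + by)^2 - D·y^2, which bounds y and reduces the search for x to a perfect-square test.

```python
from math import isqrt

def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))

def represents(form, m):
    """True if the positive definite form (a, b, c) takes the value m."""
    a, b, c = form
    D = b * b - 4 * a * c
    y = 0
    while -D * y * y <= 4 * a * m:
        s2 = 4 * a * m + D * y * y      # must equal (2ax + by)^2
        s = isqrt(s2)
        if s * s == s2:
            for t in (s, -s):           # t plays the role of 2ax + by
                for yy in (y, -y):
                    if (t - b * yy) % (2 * a) == 0:
                        return True
        y += 1
    return False

# Cosets of H in ker(chi) for D = -20, taken from (2.20).
cosets = {(1, 0, 5): {1, 9}, (2, 2, 3): {3, 7}}

def mismatches(limit):
    """Pairs (p, form) where representation and congruence disagree."""
    bad = []
    for p in range(3, limit, 2):
        if not is_prime(p) or 20 % p == 0:
            continue                    # skip composites and the prime 5
        for form, coset in cosets.items():
            if represents(form, p) != (p % 20 in coset):
                bad.append((p, form))
    return bad

print(mismatches(3000))
```

An empty list is what (2.22) predicts; the same loop can be pointed at the forms and cosets of (2.21) to test (2.23).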

For us, the most interesting genus is the one containing the principal 
form, which following Gauss, we call the principal genus. When D = -4n,
the principal form is x^2 + ny^2, and since x^2 + ny^2 is congruent modulo
4n to x^2 or x^2 + n, depending on whether y is even or odd, we get the
following explicit congruence conditions for this case:

Corollary 2.27. Let n be a positive integer and p an odd prime not dividing
n. Then p is represented by a form of discriminant -4n in the principal
genus if and only if for some integer β,


p ≡ β^2 or β^2 + n mod 4n. Q.E.D.


There is also a version of this for discriminants D ≡ 1 mod 4 (see Exercise
2.20).

The nicest case of Corollary 2.27 is when the principal genus consists of 
a single class, for then we get congruence conditions that characterize p =
x^2 + ny^2. This is what happened when n = 5 (see (2.22)), and this isn’t the
only case. For example, the table of reduced forms in Lagrange’s memoir
[69, pp. 766-767] shows that the same thing happens for n = 6, 10, 13, 15,
21, 22 and 30—for each of these n’s, the principal genus consists of only 
one class (see Exercise 2.21). Corollary 2.27 then gives us the following 
theorems for primes p: 


              p = x^2 + 6y^2  ⟺ p ≡ 1, 7 mod 24
              p = x^2 + 10y^2 ⟺ p ≡ 1, 9, 11, 19 mod 40
              p = x^2 + 13y^2 ⟺ p ≡ 1, 9, 17, 25, 29, 49 mod 52
(2.28)        p = x^2 + 15y^2 ⟺ p ≡ 1, 19, 31, 49 mod 60
              p = x^2 + 21y^2 ⟺ p ≡ 1, 25, 37 mod 84
              p = x^2 + 22y^2 ⟺ p ≡ 1, 9, 15, 23, 25, 31, 47, 49, 71, 81 mod 88
              p = x^2 + 30y^2 ⟺ p ≡ 1, 31, 49, 79 mod 120.


It should be clear that this is a powerful theory! A natural question to ask 
is how often does the principal genus consist of only one class, i.e., how
many theorems like (2.28) do we get? We will explore this question in more 
detail in §3. 
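
The congruence classes in Corollary 2.27 can be listed mechanically: run β over the residues modulo 4n and keep those values of β^2 and β^2 + n that are prime to 4n. The following Python sketch does this; the function name principal_genus_classes is our own.

```python
from math import gcd

def principal_genus_classes(n):
    """Classes beta^2 and beta^2 + n modulo 4n that are prime to 4n."""
    m = 4 * n
    classes = set()
    for beta in range(m):
        for v in (beta * beta % m, (beta * beta + n) % m):
            if gcd(v, m) == 1:
                classes.add(v)
    return sorted(classes)

for n in (5, 6, 10, 13, 15, 21, 22, 30):
    print(n, principal_genus_classes(n))
```

For n = 5 this gives the classes 1 and 9 modulo 20 of (2.22), and for the seven values of n in (2.28) the output can be compared line by line with that table. For other n the list still describes the principal genus, but it characterizes p = x^2 + ny^2 only when that genus consists of a single class.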

The genus theory just discussed has been very successful, but it hasn’t 
solved all of the problems posed in §1. In particular, we have yet to prove
Fermat’s conjecture concerning pq = x^2 + 5y^2, and we’ve only done parts
of Euler’s conjectures (1.20) and (1.21) concerning x^2 + 5y^2 and x^2 + 14y^2.
To complete the proofs, we again turn to Lagrange for help. 

Let’s begin with x^2 + 5y^2. We’ve already proved the part concerning
when a prime p can equal x^2 + 5y^2 (see (2.22)), but it remains to show
that for primes p and q,


              p, q ≡ 3, 7 mod 20 ⟹ pq = x^2 + 5y^2   (Fermat)
(2.29)
              p ≡ 3, 7 mod 20 ⟹ 2p = x^2 + 5y^2      (Euler).

Lagrange’s argument [69, pp. 788-789] is as follows. He first notes that
primes congruent to 3 or 7 modulo 20 can be written as 2x^2 + 2xy + 3y^2
(this is (2.22)), so that both parts of (2.29) can be proved by showing that
the product of two numbers represented by 2x^2 + 2xy + 3y^2 is of the form
x^2 + 5y^2. He then states the identity


(2.30)        (2x^2 + 2xy + 3y^2)(2z^2 + 2zw + 3w^2)
                  = (2xz + xw + yz + 3yw)^2 + 5(xw - yz)^2

(see Exercise 2.22), and everything is proved!


Turning to Euler’s conjecture (1.21) for x^2 + 14y^2, we proved part of it
in (2.23), but we still need to show that


p ≡ 3, 5, 13, 19, 27, 45 mod 56 ⟺ 3p = x^2 + 14y^2.


Using (2.23), it suffices to show that 3 times a number represented by 3x^2 +
2xy + 5y^2, or more generally the product of any two such numbers, is of the
form x^2 + 14y^2. So what we need is another identity of the form (2.30), and
in fact there is a version of (2.30) that holds for any form of discriminant
-4n:


(2.31)        (ax^2 + 2bxy + cy^2)(az^2 + 2bzw + cw^2)
                  = (axz + bxw + byz + cyw)^2 + n(xw - yz)^2

(see Exercise 2.21). Applying this to 3x^2 + 2xy + 5y^2 and n = 14, we are
done.
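
Identities such as (2.30) and (2.31) are polynomial identities, so they are proved by expanding both sides, but a quick numerical spot check is a useful guard against sign errors. In the following sketch the function names are ours, and n is computed as ac - b^2, which is what discriminant -4n means for the form ax^2 + 2bxy + cy^2.

```python
from itertools import product

def form_value(a, b, c, x, y):
    """Value of ax^2 + 2bxy + cy^2 (note the even middle coefficient 2b)."""
    return a * x * x + 2 * b * x * y + c * y * y

def identity_2_31_holds(a, b, c, x, y, z, w):
    n = a * c - b * b                   # discriminant is (2b)^2 - 4ac = -4n
    lhs = form_value(a, b, c, x, y) * form_value(a, b, c, z, w)
    X = a * x * z + b * x * w + b * y * z + c * y * w
    Y = x * w - y * z
    return lhs == X * X + n * Y * Y

# Spot check for 3x^2 + 2xy + 5y^2 (a, b, c = 3, 1, 5 and n = 14).
print(all(identity_2_31_holds(3, 1, 5, *v)
          for v in product(range(-4, 5), repeat=4)))
```

For instance, taking (x, y) = (1, 0) and (z, w) = (0, 1) gives the values 3 and 5 on the left and 1^2 + 14·1^2 on the right, which is the representation 15 = 3·5 = x^2 + 14y^2.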

We can also explain one other aspect of Euler’s conjectures (1.20) and 
(1.21), for recall that we wondered why (1.20) used 2p while (1.21) used 3p. 
The answer again involves the identities (2.30) and (2.31): they show that 
2 (resp. 3) can be replaced by any value represented by 2x^2 + 2xy + 3y^2
(resp. 3x^2 + 2xy + 5y^2). But Legendre’s observation from the proof of Theorem
2.8 shows that 2 (resp. 3) is the best choice because it’s the smallest
nonzero value represented by the form in question. We will see below and 
in §3 that identities like (2.30) and (2.31) are special cases of the composi- 
tion of quadratic forms. 

We now have complete proofs of Euler’s conjectures (1.20) and (1.21) 
for x^2 + 5y^2 and x^2 + 14y^2. Notice that we’ve used a lot of mathematics:
quadratic reciprocity, reduced quadratic forms, genus theory and the com- 
position of quadratic forms. This amply justifies the high estimate of Euler’s 
insight that was made in §1, and Lagrange is equally impressive for provid- 
ing the proper tools to understand what lay behind Euler’s conjectures. 


D. Lagrange and Legendre 


We've already described parts of Lagrange’s memoir “Recherches d’Arith- 
métique”, but there are some further comments we’d like to add. First, 
although we credit Lagrange with the discovery of genus theory, it appears
only implicitly in his work. The groupings that appear in his tables of re- 
duced forms are striking, but Lagrange’s comments on genus theory are a 
different matter. On the page before the tables begin, Lagrange explains 
his grouping of forms as follows: “when two different [forms] give the same 
values of b [in (Z/4nZ)*], one combines these [forms] into the same case” 
[69, p. 765]. This is the sum total of what Lagrange says about genus theory! 

After completing the basic theory of quadratic forms (both definite and 
indefinite), Lagrange gives some applications to number theory. To moti- 
vate his results, he turns to Fermat and Euler, and he quotes from two 
of our main sources of inspiration: Fermat’s 1658 letter to Digby and Euler’s
1744 paper on prime divisors of paa ± qyy. Lagrange explicitly states
Fermat’s results (1.1) on primes of the form x^2 + ny^2, n = 1, 2 or 3, and
he notes Fermat’s speculation that pq = x^2 + 5y^2 whenever p and q are
primes congruent to 3 or 7 modulo 20. Lagrange also mentions several of 
Euler’s conjectures, including (1.20), and he adds “one finds a very large 
number of similar theorems in Volume XIV of the old Commentaires de 
Pétersbourg [where Euler’s 1744 paper appeared], but none of them have 
been demonstrated until now” [69, pp. 775-776]. 

The last section of Lagrange’s memoir is titled “Prime numbers of the 
form 4nm + b which are at the same time of the form x^2 + ny^2” [69, p.
775]. It’s clear that Lagrange wanted to prove Theorem 2.26, so that he 
could read off corollaries like (2.17), (2.22), (2.23) and (2.28). The problem 
is that these proofs depend on quadratic reciprocity, which Lagrange didn’t 
know in general—he could only prove some special cases. For example, he 
was able to determine (±2/p), (±3/p) and (±5/p), but he had only partial
results for (±7/p). Thus, he could prove all of (2.22) but only parts of the
others (see [69, pp. 784-793] for the full list of his results). To get the flavor 
of Lagrange’s arguments, the reader should see Exercise 2.23 or Scharlau 
and Opolka [86, pp. 41-43]. At the end of the memoir, Lagrange summa- 
rizes what he could prove about quadratic reciprocity, stating his results in 
terms of Euler’s criterion 


              a^((p-1)/2) ≡ (a/p) mod p.

For example, for (2/p), Lagrange states [69, p. 794]:

Thus, if p is a prime number of one of the forms 8n ± 1, 2^((p-1)/2) -
1 will be divisible by p, and if p is of the form 8n ± 3, 2^((p-1)/2) + 1
will thus be divisible by p.


We next turn to Legendre. In his 1785 memoir “Recherches d’Analyse 
Indeterminée” [75], the two major results are first, a necessary and sufficient
criterion for the equation

ax^2 + by^2 + cz^2 = 0,    a, b, c ∈ Z


to have a nontrivial integral solution, and second, a proof of quadratic reciprocity.
Legendre was clearly influenced by Lagrange, but he replaces Lagrange’s
“2^((p-1)/2) - 1 will be divisible by p” by the simpler phrase “2^((p-1)/2)
= 1”, where, as he warns the reader, “one has thrown out the multiples
of p in the first member” [75, p. 516]. He then goes on to state quadratic 
reciprocity in the following form [75, p. 517]: 


c and d being two [odd] prime numbers, the expressions c^((d-1)/2),
d^((c-1)/2) do not have different signs except when c & d are both of the
form 4n — 1; in all other cases, these expressions will always have the 
same sign. 


Except for the notation, this is a thoroughly modern statement of quadratic 
reciprocity. Legendre’s proof is a different matter, for it is quite incomplete. 
We won’t examine the proof in detail—this is done in Weil [106, pp. 328- 
330 and 344-345]. Suffice it to say that some of the cases are proved rig- 
orously (see Exercise 2.24), some depend on Dirichlet’s theorem on primes 
in arithmetic progressions, and some are a tangle of circular reasoning. 

In 1798 Legendre published a more ambitious work, the Essai sur la 
Théorie des Nombres. (The third edition [74], published 1830, was titled 
Théorie des Nombres, and all of our references will be to this edition.) Le- 
gendre must have been dissatisfied with the notation of the ‘‘Recherches’’, 
for in the Essai he introduces the Legendre symbol (a/p). Then, in a sec- 
tion titled “Theorem containing a law of reciprocity which exists between 
two arbitrary prime numbers,” Legendre states that if n and m are distinct
odd primes, then

(n/m) = (-1)^((n-1)/2 · (m-1)/2) (m/n)


(see [74, Vol. I, p. 230]). This is where our notation and terminology for 
quadratic reciprocity come from. Unfortunately, the Essai repeats Legen- 
dre’s incomplete proof from 1785, although by the 1830 edition there had 
been enough criticism of this proof that Legendre added Gauss’ third proof 
of reciprocity as well as one communicated to him by Jacobi (still maintain- 
ing that his original proof was valid). 
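
Both Lagrange’s statement about 2^((p-1)/2) and Legendre’s law can be checked on small primes by computing the Legendre symbol through Euler’s criterion. The following Python sketch does this; the function names are ours, and the check is of course no substitute for a proof.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)         # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def reciprocity_holds(m, n):
    """Legendre's law for distinct odd primes m and n."""
    sign = -1 if (m % 4 == 3 and n % 4 == 3) else 1
    return legendre(n, m) == sign * legendre(m, n)

odd_primes = [p for p in range(3, 200)
              if all(p % d for d in range(2, int(p ** 0.5) + 1))]

print(all(reciprocity_holds(m, n)
          for m in odd_primes for n in odd_primes if m != n))
print(all((legendre(2, p) == 1) == (p % 8 in (1, 7)) for p in odd_primes))
```

The first line tests the reciprocity law in Legendre’s formulation (the two symbols differ in sign exactly when both primes are of the form 4n - 1), and the second tests Lagrange’s description of (2/p).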

The Essai also contains a treatment of quadratic forms. Like Lagrange, 
one of Legendre’s goals was to prove theorems in number theory using 
quadratic forms. The difference is that Legendre knows quadratic reciprocity
(or at least he thinks he does), and this allows him to state a version of
our main result, Theorem 2.26. Legendre calls it his “Théorème Général”
[74, Vol. I, p. 299], and it goes as follows: if [a] is a congruence class lying
in ker(χ), then


every prime number comprised of the form 4nx + a ... will consequently
be given by one of the quadratic forms py^2 + 2qyz + rz^2
which correspond to the linear form 4nx +a. 


The terminology here is interesting. Euler and Lagrange would speak of 
numbers “of the form” 4nx + a or “of the form” ax^2 + bxy + cy^2. As the
above quote indicates, Legendre distinguished these two by calling them 
linear forms and quadratic forms respectively. This is where we get the 
term “quadratic form”. 

While Legendre’s “Théorème” makes no explicit reference to genus theory,
the context shows that it’s there implicitly. Namely, Legendre’s book
has tables similar to Lagrange’s, with the forms grouped according to the 
values they represent in (Z/DZ)*. Since the explanation of the tables immediately
precedes the statement of the “Théorème” [74, Vol. I, pp. 286-298],
it’s clear that Legendre’s correspondence between linear forms and
quadratic forms is exactly that given by Theorem 2.26. 

To Legendre, this theorem “is, without contradiction, one of the most 
general and most important in the theory of numbers” [74, Vol. I, p. 302]. 
Its main consequence is that every entry in his tables becomes a theorem, 
and Legendre gives several pages of explicit examples [74, Vol. I, pp. 305- 
307]. This is a big advance over what Lagrange could do, and Legendre 
notes that quadratic reciprocity was the key to his success [74, Vol. I, p.
307]:


Lagrange is the first who opened the way for the study of these 
sorts of theorems. ... But the methods which served the great geome- 
ter are not applicable ... except in very few cases; and the difficulty 
in this regard could not be completely resolved without the aid of the 
law of reciprocity. 


Besides completing Lagrange’s program, Legendre also tried to under- 
stand some of the other ideas implicit in Lagrange’s memoir. We will dis- 
cuss one of Legendre’s attempts that is particularly relevant to our 
purposes: his theory of composition. Legendre’s basic idea was to gen- 
eralize the identity (2.30) 


(2x^2 + 2xy + 3y^2)(2z^2 + 2zw + 3w^2)
= (2xz + xw + yz + 3yw)^2 + 5(xw - yz)^2


used by Lagrange in proving the conjectures of Fermat and Euler concerning
x^2 + 5y^2. We gave one generalization in (2.31), but Legendre saw
that something more general was going on. More precisely, let f(x,y) and
g(x,y) be forms of discriminant D. Then a form F(x,y) of the same dis- 
criminant is their composition provided that 


f(x,y) g(z,w) = F(B_1(x,y;z,w), B_2(x,y;z,w))
where
B_i(x,y;z,w) = a_i xz + b_i xw + c_i yz + d_i yw,    i = 1, 2


are bilinear forms in x,y and z,w. Thus Lagrange’s identity shows that
x^2 + 5y^2 is the composition of 2x^2 + 2xy + 3y^2 with itself. And this is not
the only example we’ve seen—the reader can check that (1.3), (1.6) and 
(2.31) are also examples of the composition of forms. 

A useful consequence of composition is that whenever F(x,y) is com- 
posed of f(x,y) and g(x,y), then the product of numbers represented by 
f(x,y) and g(x,y) will be represented by F(x,y). This was the idea that 
enabled us to complete the conjectures of Fermat and Euler for x^2 + 5y^2
and x^2 + 14y^2.

The basic question is whether any two forms of the same discriminant 
can be composed, and Legendre showed that the answer is yes [74, Vol. II, 
pp. 27-30]. For simplicity, let’s discuss the case where the forms f(x,y) =
ax^2 + 2bxy + cy^2 and g(x,y) = a'x^2 + 2b'xy + c'y^2 have discriminant -4n,
and a and a’ are relatively prime (we can always arrange the last condition 
by changing the forms by a proper equivalence). Then the Chinese remain- 
der theorem shows that there is a number B such that 


              B ≡ ±b mod a
(2.32)
              B ≡ ±b' mod a'.


It follows that B^2 + n ≡ b^2 + (ac - b^2) ≡ 0 mod a, so that a | B^2 + n. The
same holds for a', and thus aa' | B^2 + n. Then Legendre shows that the
form

F(x,y) = aa'x^2 + 2Bxy + ((B^2 + n)/(aa')) y^2

is the composition of f(x,y) and g(x,y). A modern account of Legendre’s
argument may be found in Weil [106, pp. 332-335], and we will consider
this problem (from a slightly different point of view) in §3 when we discuss 
composition in more detail. 

Because of the ± signs in (2.32), two forms in general may be composed
in four different ways. For example, the forms 14x^2 + 10xy + 21y^2
and 9x^2 + 2xy + 30y^2 compose to the four forms


126x^2 ± 38xy + 5y^2,    126x^2 ± 74xy + 13y^2,


and it is easy to show that these forms all lie in different classes (see Exercise
2.26). Since Legendre used equivalence rather than proper equivalence,
he sees two rather than four forms here; for him, this operation “leads in
general to two solutions” [74, Vol. II, p. 28].
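
The recipe (2.32) together with the formula for F(x,y) is easy to carry out by machine. The sketch below is an illustration under our own conventions: forms are triples (a, 2b, c), the function name legendre_compose is ours, the signs in (2.32) are passed as s1 and s2, and B is normalized to lie in the interval (-aa'/2, aa'/2], a choice we make only to obtain a definite representative. It relies on pow(x, -1, m) for modular inverses, which requires Python 3.8 or later.

```python
from math import gcd

def legendre_compose(f, g, s1=1, s2=1):
    """Form aa'x^2 + 2Bxy + ((B^2 + n)/(aa'))y^2 with B = s1*b mod a and
    B = s2*b' mod a', for f = (a, 2b, c), g = (a', 2b', c') of disc -4n."""
    a, two_b, c = f
    a2, two_b2, c2 = g
    D = two_b * two_b - 4 * a * c
    if D >= 0 or D != two_b2 * two_b2 - 4 * a2 * c2:
        raise ValueError("forms must share a negative discriminant")
    if two_b % 2 or two_b2 % 2 or gcd(a, a2) != 1:
        raise ValueError("need even middle coefficients and gcd(a, a') = 1")
    n = -D // 4
    b, b2 = two_b // 2, two_b2 // 2
    m = a * a2
    # Chinese remainder theorem: combine the two congruences of (2.32).
    B = (s1 * b * a2 * pow(a2, -1, a) + s2 * b2 * a * pow(a, -1, a2)) % m
    if 2 * B > m:
        B -= m                          # representative in (-m/2, m/2]
    return (m, 2 * B, (B * B + n) // m)

f, g = (14, 10, 21), (9, 2, 30)
for s1 in (1, -1):
    for s2 in (1, -1):
        print(s1, s2, legendre_compose(f, g, s1, s2))
```

Run on the example above, the four sign choices yield the four forms 126x^2 ± 38xy + 5y^2 and 126x^2 ± 74xy + 13y^2, each of discriminant -4·269.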

One of Legendre’s important ideas is that since every form is equivalent 
to a reduced one, it suffices to work out the compositions of reduced forms. 
The resulting table would then give the compositions of all possible forms 
of that discriminant. Let’s look at the case n = 41, which Legendre does in 
detail in [74, Vol. II, pp. 39-40]. He labels the reduced forms as follows: 


              A = x^2 + 41y^2

              B = 2x^2 + 2xy + 21y^2
(2.33)        C = 5x^2 + 4xy + 9y^2

              D = 3x^2 + 2xy + 14y^2

              E = 6x^2 + 2xy + 7y^2.


(Legendre writes the forms slightly differently, but it’s more convenient to 
work with reduced forms.) He then gives the following table of composi- 
tions: 


(2.34)
AA = A     BB = A     CC = A or B     DD = A or C     EE = A or C

AB = B     BC = C     CD = D or E     DE = B or C

AC = C     BD = E     CE = D or E

AD = D     BE = D

AE = E


This almost looks like the multiplication table for a group, but the binary 
operation isn’t single-valued. To the modern reader, it’s clear that Legendre 
must be doing something slightly wrong. 

One problem is that (2.33) lists 5 forms, while the class number is 8. (C, 
D and E each give two reduced forms, while A and B each give only one.) 
This is closely related to the ambiguity in Legendre’s operation: as long as 
we work with equivalence rather than proper equivalence, we can’t fix the
sign of the middle coefficient 2b of a reduced form, so that the ± signs in
(2.32) are forced upon us.

This suggests that composition might give a group operation on the class- 
es of forms of discriminant D. However, there remain serious problems 
to be solved. Composition, as defined above, is still a multiple-valued operation.
Thus one has to show that the signs in (2.32) can be chosen uniformly
so that as we vary f(x,y) and g(x,y) within their proper equivalence
classes, the resulting compositions are all properly equivalent. Then one has 
to worry about associativity, inverses, etc. There’s a lot of work to be done! 

This concludes our discussion of Lagrange and Legendre. While the last
few pages have raised more questions than answers, the reader should still 
be convinced of the richness of the theory of quadratic forms. The surpris- 
ing fact is that we have barely reached the really interesting part of the 
theory, for we have yet to consider the work of Gauss. 


E. Exercises 


2.1. If a form f(x,y) represents an integer m, show that m can be written
m = d^2 m', where f(x,y) properly represents m'.


2.2. In this exercise we study equivalence and proper equivalence.

(a) Show that equivalence and proper equivalence are equivalence
relations.

(b) Show that improper equivalence is not an equivalence relation.

(c) Show that equivalent forms represent the same numbers, and
show that the same holds for proper representations.

(d) Show that any form equivalent to a primitive form is itself primitive.
Hint: use (c).


2.3. Let f(x,y) and g(x,y) be forms of discriminants D and D' respectively,
and assume that there are integers p, q, r and s such that

f(x,y) = g(px + qy, rx + sy).

Prove that D = (ps - qr)^2 D'.


2.4. Let f(x,y) be a form of discriminant D ≠ 0.

(a) If D > 0, then use (2.4) to prove that f(x,y) represents both positive
and negative numbers.

(b) If D < 0, then show that f(x,y) represents only positive or only
negative numbers, depending on the sign of the coefficient of x^2.


2.5. Formulate and prove a version of Corollary 2.6 which holds for arbitrary
discriminants.


2.6. Find a reduced form that is properly equivalent to the form 126x^2 +
74xy + 13y^2. Hint: make the middle coefficient small (see the proof
of Theorem 2.8).


2.7. Prove (2.9) for forms that satisfy |b| ≤ a ≤ c.


2.8. This exercise is concerned with the uniqueness part of Theorem 2.8.

(a) Prove (2.11).

(b) Prove a version of (2.11) that holds in the exceptional cases |b| =
a or a = c, and use this to complete the uniqueness part of the
proof of Theorem 2.8.


2.9. Write a computer program that computes all reduced forms for a
given discriminant in the range -32768 < D < 0. This range is easily
implemented using the integer arithmetic of standard languages such
as BASIC or Pascal. For example, one finds that h(-32767) = 52. If
you don’t write a computer program, you should check the following
examples by hand.

(a) Verify the entries in table (2.14).

(b) Compute all reduced forms of discriminants -3, -15, -24, -31,
and -52.


2.10. This exercise is concerned with indefinite forms of discriminant D >
0, D not a perfect square. The last condition implies that the outer
coefficients of a form with discriminant D are nonzero.

(a) Adapt the proof of Theorem 2.8 to show that any form of discriminant
D is properly equivalent to ax^2 + bxy + cy^2, where

|b| ≤ |a| ≤ |c|.

(b) If ax^2 + bxy + cy^2 satisfies the above inequalities, prove that

|a| ≤ √D / 2.

(c) Conclude that there are only finitely many proper equivalence
classes of forms of discriminant D. This proves that the class
number h(D) is finite.


2.11. Use Theorem 2.16, quadratic reciprocity and table (2.14) to prove
Fermat’s three theorems (1.1) and the new result (2.17) for x^2 + 7y^2.


2.12. This exercise is concerned with the proof of Theorem 2.18.

(a) If m > 1 is an integer which is not a prime power, prove that m
can be written m = ac where 1 < a < c and gcd(a,c) = 1.

(b) Show that h(-32) = 2 and h(-124) = 3.


2.13. Use Theorem 2.16, quadratic reciprocity and table (2.14) to prove
(2.19), and work out similar results for discriminants -3, -15, -24,
-31 and -52.


2.14. Prove (2.20) and (2.21). Hint: use Lemma 2.24.


2.15. Prove (2.23).


2.16. Let D be a number congruent to 1 modulo 4. Show that the form
x^2 + xy + ((1 - D)/4) y^2 has discriminant D, and show that it is reduced
when D < 0.


2.17. In this exercise, we will complete the proof of Lemma 2.24 for
discriminants $D \equiv 1 \bmod 4$. Let $\chi : (\mathbb{Z}/D\mathbb{Z})^* \to \{\pm 1\}$ be as in
Lemma 1.14.

(a) If an even number is properly represented by a form of
discriminant D, then show that $D \equiv 1 \bmod 8$. Hint: use Lemma 2.3.

(b) If m is relatively prime to D and is represented by a form of
discriminant D, then show that $[m] \in \ker(\chi)$. Hint: use Lemma
2.5 and, when m is even, (a) and Exercise 1.12(c).

(c) Let $H \subset (\mathbb{Z}/D\mathbb{Z})^*$ be the subgroup of squares. Show that H
consists of the values represented by $x^2 + xy + ((1 - D)/4)y^2$. Hint:
use

$4\left(x^2 + xy + \frac{1 - D}{4} y^2\right) \equiv (2x + y)^2 \bmod D$.

(d) If f(x,y) is a form of discriminant D, then show that the values
in $(\mathbb{Z}/D\mathbb{Z})^*$ represented by f(x,y) form a coset of H in $\ker(\chi)$.
Hint: use (2.4).

2.18. Let $f(x,y) = ax^2 + bxy + cy^2$, where as usual we assume
gcd(a,b,c) = 1.

(a) Given a prime p, prove that at least one of f(1,0), f(0,1) and
f(1,1) is relatively prime to p.

(b) Prove Lemma 2.25. Hint: use (a) and the Chinese Remainder
Theorem.

2.19. Work out the genus theory of Theorem 2.26 for discriminants -15,
-24, -31 and -52. Your answers should be similar to (2.22) and
(2.23).

2.20. Formulate and prove a version of Corollary 2.27 for negative
discriminants $D \equiv 1 \bmod 4$. Hint: by Exercise 2.17(c), H is the
subgroup of squares.

2.21. Prove (2.28). Hint: for each n, find the reduced forms and use
Lemma 2.24.

2.22. Prove (2.30) and its generalization (2.31).

2.23. The goal of this exercise is to prove that (-2/p) = 1 when
$p \equiv 1,3 \bmod 8$. The argument below is due to Lagrange, and is similar
to the one used by Euler in his proof of the Reciprocity Step for
$x^2 + 2y^2$ [33, Vol. II, pp. 240-281].

(a) When $p \equiv 1 \bmod 8$, write p = 8k + 1, and then use the identity

$x^{8k} - 1 = ((x^{2k} - 1)^2 + 2x^{2k})(x^{4k} - 1)$

to show that (-2/p) = 1.

(b) When $p \equiv 3 \bmod 8$, assume that (-2/p) = -1. Show that
(2/p) = 1, and thus by Corollary 2.6, p is represented by a form
of discriminant 8.

(c) Use Exercise 2.10(a) to show that any form of discriminant 8 is
properly equivalent to $\pm(x^2 - 2y^2)$.

(d) Show that an odd prime $p = \pm(x^2 - 2y^2)$ must be congruent to
$\pm 1$ modulo 8.

From (a)-(d), it follows easily that (-2/p) = 1 when $p \equiv 1,3 \bmod 8$.

2.24. One of the main theorems in Legendre’s 1785 memoir [74, pp. 509-
513] states that the equation

$ax^2 + by^2 + cz^2 = 0$,

where abc is squarefree, has a nontrivial integral solution if and only
if

(i) a, b and c are not all of the same sign, and

(ii) -bc, -ac and -ab are quadratic residues modulo |a|, |b| and
|c| respectively.

As we’ve already noted, Legendre tried to use this result to prove
quadratic reciprocity. In this problem, we will treat one of the cases
where he succeeded. Let p and q be primes which satisfy $p \equiv 1 \bmod 4$
and $q \equiv 3 \bmod 4$, and assume that (p/q) = -1 and (q/p) = 1. We
will derive a contradiction as follows:

(a) Use Legendre’s theorem to show that $x^2 + py^2 - qz^2 = 0$ has a
nontrivial integral solution.

(b) Working modulo 4, show that $x^2 + py^2 - qz^2 = 0$ has no
nontrivial integral solutions.

In [106, pp. 339-345], Weil explains why this argument works.

2.25. Recall that the opposite of the form $ax^2 + bxy + cy^2$ is the form
$ax^2 - bxy + cy^2$. Prove that two forms are properly equivalent if
and only if their opposites are.

2.26. Verify that $14x^2 + 10xy + 21y^2$ and $9x^2 + 2xy + 30y^2$ compose to
the four forms $126x^2 \pm 74xy + 13y^2$ and $126x^2 \pm 38xy + 5y^2$, and
show that they all lie in different classes. Hint: use Exercises 2.6 and
2.25.

2.27. Let p be a prime number which is represented by forms f(x,y) and
g(x,y) of the same discriminant.

(a) Show that f(x,y) and g(x,y) are equivalent. Hint: use Lemma
2.3, and examine the middle coefficient modulo p.

(b) If $f(x,y) = x^2 + ny^2$, and g(x,y) is reduced, then show that
f(x,y) and g(x,y) are equal.
While genus theory and composition were implicit in Lagrange’s work,
these concepts are still primarily linked to Gauss, and for good reason:
he may not have been the first to use them, but he was the first to
understand their astonishing depth and interconnection. In this section we will
prove Gauss’ major results on composition and genus theory for the
special case of positive definite forms. We will then apply this theory to our
question concerning primes of the form $x^2 + ny^2$, and we will also discuss
Euler’s convenient numbers. These turn out to be those n’s for which each
genus consists of a single class, and it is still not known exactly how many
there are. The section will end with a discussion of Gauss’ Disquisitiones
Arithmeticae.


A. Composition and the Class Group 


The basic definition of composition was given in §2: if f(x,y) and g(x,y)
are primitive positive definite forms of discriminant D, then a form F(x,y)
of the same type is their composition provided that

$f(x,y)g(z,w) = F(B_1(x,y;z,w), B_2(x,y;z,w))$,

where

$B_i(x,y;z,w) = a_i xz + b_i xw + c_i yz + d_i yw, \quad i = 1,2$

are integral bilinear forms. Two forms can be composed in many different
ways, and the resulting forms need not be properly equivalent. In §2
we gave an example of two forms whose compositions lay in four distinct
classes. So if we want a well-defined operation on classes of forms, we must
somehow restrict the notion of composition. Gauss does this as follows:
given the above composition data, he proves that

(3.1)  $a_1 b_2 - a_2 b_1 = \pm f(1,0), \quad a_1 c_2 - a_2 c_1 = \pm g(1,0)$

(see [41, §235] or Exercise 3.1), and then he defines the composition to be
a direct composition provided that both of the signs in (3.1) are +.

The main result of Gauss’ theory of composition is that for a fixed dis- 
criminant, direct composition makes the set of classes of forms into a finite 
Abelian group [41, §§236-40, 245 and 249]. Unfortunately, direct composition
is an awkward concept to work with, and Gauss’ proof of the group
structure is long and complicated. So rather than follow Gauss, we will take 
a different approach to the study of composition. The basic idea is due to 
Dirichlet [28, Supplement X], though his treatment was clearly influenced 
by Legendre. Before giving Dirichlet’s definition, we will need the following 
lemma: 
Lemma 3.2. Assume that $f(x,y) = ax^2 + bxy + cy^2$ and
$g(x,y) = a'x^2 + b'xy + c'y^2$ have discriminant D and satisfy
$\gcd(a, a', (b + b')/2) = 1$ (since b and b' have the same parity,
(b + b')/2 is an integer). Then there is a unique integer B modulo 2aa'
such that

$B \equiv b \bmod 2a$
$B \equiv b' \bmod 2a'$
$B^2 \equiv D \bmod 4aa'$.


Proof. The first step is to put these congruences into a standard form. If a
number B satisfies the first two, then

$B^2 - (b + b')B + bb' \equiv (B - b)(B - b') \equiv 0 \bmod 4aa'$,

so that the third congruence can be written as

$(b + b')B \equiv bb' + D \bmod 4aa'$.

Dividing by 2, this becomes

(3.3)  $(b + b')/2 \cdot B \equiv (bb' + D)/2 \bmod 2aa'$.

If we multiply the first two congruences by a' and a respectively and
combine them with (3.3), we see that the three congruences in the statement of
the lemma are equivalent to

        $a' \cdot B \equiv a'b \bmod 2aa'$
(3.4)   $a \cdot B \equiv ab' \bmod 2aa'$
        $(b + b')/2 \cdot B \equiv (bb' + D)/2 \bmod 2aa'$.


The following lemma tells us about the solvability of these congruences: 


Lemma 3.5. Let $p_1, q_1, \dots, p_r, q_r, m$ be numbers with $\gcd(p_1, \dots, p_r, m) = 1$.
Then the congruences

$p_i B \equiv q_i \bmod m, \quad i = 1, \dots, r$

have a unique solution modulo m if and only if for all i,j = 1,...,r we have

(3.6)  $p_i q_j \equiv p_j q_i \bmod m$.


Proof. See Exercise 3.3. Q.E.D. 
Since we are assuming $\gcd(a, a', (b + b')/2) = 1$, the congruences (3.4)
satisfy the gcd condition of the above lemma, and the compatibility
conditions (3.6) are easy to verify (see Exercise 3.4). The existence and
uniqueness of the desired B follow immediately. Q.E.D.
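For small coefficients, the integer B of Lemma 3.2 can also be found by direct search, which gives a convenient check on hand computations. The following is a minimal sketch and not the method of the proof: the function name find_B and the brute-force search over 0 <= B < 2aa' are our own choices, and we assume a, a' > 0.

```python
from math import gcd

def find_B(a, b, a2, b2, D):
    """Search for the integer B of Lemma 3.2 in the range 0 <= B < 2*a*a2.

    (a, b) and (a2, b2) are the first two coefficients of f and g (with
    a, a2 > 0); both forms are assumed to have discriminant D.
    The name and the brute-force strategy are illustrative assumptions.
    """
    if (b + b2) % 2 != 0:
        raise ValueError("b and b' must have the same parity")
    if gcd(gcd(a, a2), (b + b2) // 2) != 1:
        raise ValueError("gcd(a, a', (b + b')/2) must equal 1")
    solutions = [
        B for B in range(2 * a * a2)
        if (B - b) % (2 * a) == 0
        and (B - b2) % (2 * a2) == 0
        and (B * B - D) % (4 * a * a2) == 0
    ]
    # Lemma 3.2 predicts exactly one solution modulo 2aa'.
    assert len(solutions) == 1
    return solutions[0]

# f = 2x^2 + 7y^2 and g = 3x^2 + 2xy + 5y^2, both of discriminant -56
print(find_B(2, 0, 3, 2, -56))   # prints 8
```

Here B = 8 satisfies 8 = 0 mod 4, 8 = 2 mod 6 and 64 = -56 mod 24, as the lemma requires.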


We can now give Dirichlet’s definition of composition. Let
$f(x,y) = ax^2 + bxy + cy^2$ and $g(x,y) = a'x^2 + b'xy + c'y^2$ be primitive
positive definite forms of discriminant D < 0 which satisfy
$\gcd(a, a', (b + b')/2) = 1$.
Then the Dirichlet composition of f(x,y) and g(x,y) is the form

(3.7)  $F(x,y) = aa'x^2 + Bxy + \frac{B^2 - D}{4aa'} y^2$,

where B is the integer determined by Lemma 3.2. The basic properties of
F(x,y) are:


Proposition 3.8. Given f(x,y) and g(x,y) as above, the Dirichlet composi- 
tion F(x,y) defined in (3.7) is a primitive positive definite form of discrim- 
inant D, and F(x,y) is the direct composition of f(x,y) and g(x,y) in the 
sense of (3.1). 


Proof. An easy calculation shows that F(x,y) has discriminant D, and the 
form is consequently positive definite. 

The next step is to prove that F(x,y) is the composition of f(x,y) and 
g(x,y). We will sketch the argument and leave the details to the reader. 
Let $C = (B^2 - D)/4aa'$, so that $F(x,y) = aa'x^2 + Bxy + Cy^2$. Then, using
the first two congruences of Lemma 3.2, it is easy to show that f(x,y)
and g(x,y) are properly equivalent to the forms $ax^2 + Bxy + a'Cy^2$ and
$a'x^2 + Bxy + aCy^2$ respectively. However, for these last two forms one has
the composition identity

$(ax^2 + Bxy + a'Cy^2)(a'z^2 + Bzw + aCw^2) = aa'X^2 + BXY + CY^2$,

where $X = xz - Cyw$ and $Y = axw + a'yz + Byw$. It follows easily that
F(x,y) is the composition of f(x,y) and g(x,y). With a little more effort, 
it can be checked that this is a direct composition in Gauss’ sense (3.1). 
The details of these arguments are covered in Exercise 3.5. 

It remains to show that F(x,y) is primitive, i.e., that its coefficients are 
relatively prime. Suppose that some prime p divided all of the coefficients. 
This would imply that p divided all numbers represented by F(x,y). Since 
F(x,y) is the composition of f(x,y) and g(x,y), this implies that p di- 
vides all numbers of the form f(x, y)g(z,w). But we know that f(x,y) and 
g(x,y) are primitive, and from here it is easy to derive a contradiction 
(see Exercise 3.5 for the details). This completes the proof of the proposi- 
tion. Q.E.D. 
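The composition identity used in the proof is a polynomial identity in x, y, z, w and the coefficients, so it can be spot-checked numerically. The sketch below assumes the substitution X = xz - Cyw, Y = axw + a'yz + Byw; the function name and the random sampling are our own and serve only as a sanity check, not as a proof.

```python
import random

def check_composition_identity(a, a2, B, C, trials=200, seed=0):
    """Test (ax^2+Bxy+a'Cy^2)(a'z^2+Bzw+aCw^2) = aa'X^2 + BXY + CY^2
    at random integer points, with X = xz - Cyw and Y = axw + a'yz + Byw.
    Here a2 stands for a'. Returns True if every sample agrees."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z, w = (rng.randint(-20, 20) for _ in range(4))
        lhs = (a*x*x + B*x*y + a2*C*y*y) * (a2*z*z + B*z*w + a*C*w*w)
        X = x*z - C*y*w
        Y = a*x*w + a2*y*z + B*y*w
        rhs = a*a2*X*X + B*X*Y + C*Y*Y
        if lhs != rhs:
            return False
    return True

# Coefficients from composing 2x^2 + 7y^2 with 3x^2 + 2xy + 5y^2 (D = -56):
# a = 2, a' = 3, B = 8, C = (64 + 56)/24 = 5.
print(check_composition_identity(2, 3, 8, 5))   # prints True
```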
While Dirichlet composition is not as general as direct composition (not
all direct compositions satisfy $\gcd(a, a', (b + b')/2) = 1$), it is easier to use
in practice since there is an explicit formula (3.7) for the composition.
Notice also that the congruence conditions in Lemma 3.2 are similar to the
ones (2.32) used by Legendre. This is no accident, for when D = -4n and
gcd(a,a') = 1, Dirichlet’s formula reduces exactly to the one given by
Legendre (see Exercise 3.6).

We can now state our main result on composition: 


Theorem 3.9. Let $D \equiv 0,1 \bmod 4$ be negative, and let C(D) be the set of
classes of primitive positive definite forms of discriminant D. Then Dirichlet
composition induces a well-defined binary operation on C(D) which makes
C(D) into a finite Abelian group whose order is the class number h(D).

Furthermore, the identity element of C(D) is the class containing the
principal form

$x^2 - \frac{D}{4} y^2$   if $D \equiv 0 \bmod 4$
$x^2 + xy + \frac{1 - D}{4} y^2$   if $D \equiv 1 \bmod 4$,

and the inverse of the class containing the form $ax^2 + bxy + cy^2$ is the class
containing $ax^2 - bxy + cy^2$.


Remarks. Some terminology is in order here. 
(i) The group C(D) is called the class group, though we will sometimes 
refer to C(D) as the form class group to distinguish it from the ideal 
class group to be defined later. 


(ii) The principal form of discriminant D was introduced in §2. The class it
lies in is called the principal class. When D = -4n, the principal form
is $x^2 + ny^2$.

(iii) The form $ax^2 - bxy + cy^2$ is called the opposite of $ax^2 + bxy + cy^2$, so
that the opposite form gives the inverse under Dirichlet composition.


Proof. Let $f(x,y) = ax^2 + bxy + cy^2$ and g(x,y) be forms of the given type.
Using Lemmas 2.3 and 2.25, we can replace g(x,y) by a properly equivalent
form $a'x^2 + b'xy + c'y^2$ where gcd(a,a') = 1. Then the Dirichlet
composition of these forms is defined, which proves that Dirichlet composition is
defined for any pair of classes in C(D). To get a group structure out of this,
we must then prove that:

(i) This operation is well-defined on the level of classes, and
(ii) The induced binary operation makes C(D) into an Abelian group.

The proofs of (i) and (ii) can be done directly using the definition of
Dirichlet composition (see Dirichlet [28, Supplement X] or Flath [36, §V.2]), but
the argument is much easier using ideal class groups (to be studied in §7).
We will therefore postpone this part of the proof until then. For now, we
will assume that (i) and (ii) are true.

Let’s next show that the principal class is the identity element of C(D).
To compose the principal form with $f(x,y) = ax^2 + bxy + cy^2$, first note
that the gcd condition is clearly met, and thus the Dirichlet composition
is defined. Then observe that B = b satisfies the conditions of Lemma 3.2,
so that by formula (3.7), the Dirichlet composition F(x,y) reduces to the
given form f(x,y). This proves that the principal class is the identity.

Finally, given $f(x,y) = ax^2 + bxy + cy^2$, its opposite is
$f'(x,y) = ax^2 - bxy + cy^2$. Since $\gcd(a, a, (b + (-b))/2) = a$ may be > 1, we
can’t apply Dirichlet composition directly. But if we use the proper
equivalence $(x,y) \mapsto (-y,x)$, then we can replace f'(x,y) by
$g(x,y) = cx^2 + bxy + ay^2$. Since
$\gcd(a, c, (b + b)/2) = \gcd(a,c,b) = 1$, we can apply Dirichlet’s formulas to
f(x,y) and g(x,y). One checks easily that B = b satisfies the conditions of
Lemma 3.2, so that the Dirichlet composition is $acx^2 + bxy + y^2$. We leave
it to the reader to show that this form is properly equivalent to the principal
form (see Exercise 3.7). This completes the proof of the theorem. Q.E.D.
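The last step of the proof can be followed on a concrete example. The sketch below composes $3x^2 + 2xy + 5y^2$ (discriminant -56) with $5x^2 + 2xy + 3y^2$, which is properly equivalent to its opposite, and reduces the result. The helper names compose and reduce_form are our own; B is found by brute force rather than by Lemma 3.5, and the reduction loop is one standard way to carry out the procedure of Theorem 2.8, assumed here rather than taken from the text.

```python
from math import gcd

def compose(f, g):
    """Dirichlet composition (3.7) of f = (a, b, c) and g = (a', b', c'),
    assuming positive definite forms of equal discriminant with
    gcd(a, a', (b + b')/2) = 1. B is found by direct search."""
    a, b, c = f
    a2, b2, c2 = g
    D = b*b - 4*a*c
    assert b2*b2 - 4*a2*c2 == D, "forms must have the same discriminant"
    assert gcd(gcd(a, a2), (b + b2) // 2) == 1, "gcd condition fails"
    for B in range(2*a*a2):
        if ((B - b) % (2*a) == 0 and (B - b2) % (2*a2) == 0
                and (B*B - D) % (4*a*a2) == 0):
            return (a*a2, B, (B*B - D) // (4*a*a2))
    raise ArithmeticError("no B found; check the hypotheses of Lemma 3.2")

def reduce_form(a, b, c):
    """Return the reduced form properly equivalent to ax^2 + bxy + cy^2
    (positive definite), using (x,y) -> (-y,x) and (x,y) -> (x+ky,y)."""
    while True:
        if c < a:
            a, b, c = c, -b, a                  # (x, y) -> (-y, x)
        elif abs(b) > a or b == -a:
            k = (a - b) // (2*a)                # brings b into (-a, a]
            b, c = b + 2*a*k, a*k*k + b*k + c   # (x, y) -> (x + ky, y)
        elif a == c and b < 0:
            b = -b                              # (x, y) -> (-y, x) when a = c
        else:
            return (a, b, c)

F = compose((3, 2, 5), (5, 2, 3))
print(F)                  # prints (15, 2, 1)
print(reduce_form(*F))    # prints (1, 0, 14)
```

The composition is $15x^2 + 2xy + y^2$, of the shape $acx^2 + bxy + y^2$ found in the proof, and it reduces to the principal form $x^2 + 14y^2$.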


We can now complete the discussion (begun in §2) of Legendre’s theory 
of composition. To prevent confusion, we will distinguish between a class 
(all forms properly equivalent to a given form) and a Lagrangian class (all 
forms equivalent to a given one). In Theorem 3.9, we studied the compo- 
sition of classes, while Legendre was concerned with the composition of 
Lagrangian classes. It is an easy exercise to show that the Lagrangian class 
of a form is the union of its class and the class of its opposite (see Exer- 
cise 3.8). By Theorem 3.9, this means that a Lagrangian class is the union 
of a class and its inverse in the class group C(D). Thus Legendre’s
“operation” is the multiple-valued operation that multiplication induces on the
set C(D)/~, where ~ is the equivalence relation that identifies $x \in C(D)$
with $x^{-1}$ (see Exercise 3.9). In Legendre’s example (2.33), which dealt with
forms of discriminant -164, we will see shortly that
$C(-164) \simeq \mathbb{Z}/8\mathbb{Z}$, and
it is then an easy exercise to show that C(-164)/~ is isomorphic to the
structure given in (2.34) (see Exercise 3.9).

The elements of order $\le 2$ in the class group C(D) play a special role in
composition and genus theory. The reduced forms that lie in such classes
are easy to find:

Lemma 3.10. A reduced form $f(x,y) = ax^2 + bxy + cy^2$ of discriminant D
has order $\le 2$ in the class group C(D) if and only if b = 0, a = b or a = c.


Proof. Let f'(x,y) be the opposite of f(x,y). By Theorem 3.9, the class
of f(x,y) has order $\le 2$ if and only if the forms f(x,y) and f'(x,y) are
properly equivalent. There are two cases to consider:


|b| < a < c: Here, f'(x,y) is also reduced, so that by Theorem 2.8,
the two forms are properly equivalent $\Leftrightarrow$ b = 0.
a = b or a = c: In these cases, the proof of Theorem 2.8 shows that
the two forms are always properly equivalent.


The lemma now follows immediately. Q.E.D.


For an example of how this works, consider Legendre’s example from §2
of forms of discriminant -164. The reduced forms are listed in (2.33), and
Lemma 3.10 shows that only $2x^2 + 2xy + 21y^2$ has order 2. Since the class
number is 8, the structure theorem for finite Abelian groups shows that the
class group C(-164) must be $\mathbb{Z}/8\mathbb{Z}$.
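This example is easy to reproduce by machine. The following sketch lists the reduced primitive forms of discriminant -164 using the bounds $|b| \le a \le c$ and $3a^2 \le |D|$ from §2, and then applies the criterion of Lemma 3.10. The function names are our own, and the enumeration is a straightforward search rather than an optimized algorithm.

```python
from math import gcd

def reduced_forms(D):
    """All reduced primitive positive definite forms (a, b, c) of
    discriminant D < 0, D = 0 or 1 mod 4, found by direct search."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    while 3*a*a <= -D:                      # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a, a + 1):
            if (b*b - D) % (4*a) != 0:
                continue
            c = (b*b - D) // (4*a)
            if c < a or gcd(gcd(a, abs(b)), c) != 1:
                continue
            if b < 0 and (-b == a or a == c):
                continue                    # b >= 0 is required if |b| = a or a = c
            forms.append((a, b, c))
        a += 1
    return forms

def has_order_at_most_2(form):
    """Criterion of Lemma 3.10 for a reduced form."""
    a, b, c = form
    return b == 0 or a == b or a == c

forms = reduced_forms(-164)
print(len(forms))                                     # prints 8
print([f for f in forms if has_order_at_most_2(f)])   # prints [(1, 0, 41), (2, 2, 21)]
```

The search finds eight reduced forms, of which only the principal form and $2x^2 + 2xy + 21y^2$ have order at most 2, in agreement with the discussion above.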

A surprising fact is that one doesn’t need to list the reduced forms in 
order to determine the number of elements of order 2 in the class group: 


Proposition 3.11. Let $D \equiv 0,1 \bmod 4$ be negative, and let r be the number
of odd primes dividing D. Define the number $\mu$ as follows: if
$D \equiv 1 \bmod 4$, then $\mu = r$, and if $D \equiv 0 \bmod 4$, then D = -4n,
where n > 0, and $\mu$ is determined by the following table:

n                          $\mu$
$n \equiv 3 \bmod 4$       r
$n \equiv 1,2 \bmod 4$     r + 1
$n \equiv 4 \bmod 8$       r + 1
$n \equiv 0 \bmod 8$       r + 2

Then the class group C(D) has exactly $2^{\mu-1}$ elements of order $\le 2$.


Proof. For simplicity, we will treat only the special case D = -4n, where
$n \equiv 1 \bmod 4$. Recall that a form of discriminant -4n may be written as
$ax^2 + 2bxy + cy^2$. The basic idea of the proof is to count the number of
reduced forms that satisfy 2b = 0, a = 2b or a = c, for by Lemma 3.10, this
gives the number of classes of order $\le 2$ in C(-4n). Since n is odd, note
that r is the number of prime divisors of n.

First, consider forms with 2b = 0, i.e., the forms $ax^2 + cy^2$, where ac = n.
Since a and c must be relatively prime and positive, there are $2^r$ choices
for a. To be reduced, we must also have a < c, so that we get $2^{r-1}$ reduced
forms of this type.

Next consider forms with a = 2b or a = c. Write n = bk, where b and k
are relatively prime and 0 < b < k. As above, there are $2^{r-1}$ such b’s. Set
c = (b + k)/2, and consider the form $2bx^2 + 2bxy + cy^2$. One computes
that it has discriminant -4n, and since $n \equiv 1 \bmod 4$, its coefficients are
relatively prime. We then get $2^{r-1}$ reduced forms as follows:


2b < c: Here, $2bx^2 + 2bxy + cy^2$ is a reduced form.

2b > c: Here, $2bx^2 + 2bxy + cy^2$ is properly equivalent to
$cx^2 + 2(c - b)xy + cy^2$ via $(x,y) \mapsto (-y, x + y)$.
Since $2b > c \Rightarrow 2(c - b) < c$, the latter is reduced.


The next step is to check that this process gives all reduced forms with
a = 2b or a = c. We leave this to the reader (see Exercise 3.10).

We thus have $2^{r-1} + 2^{r-1} = 2^r$ elements of order $\le 2$, which shows that
$\mu = r + 1$ in this case. The remaining cases are similar and are left to the
reader (see Exercise 3.10, Flath [36, §V.5], Gauss [41, §§257-258] or Mathews
[78, pp. 171-173]). Q.E.D.
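Proposition 3.11 lends itself to a numerical check: compute $\mu$ from the table, count the reduced forms satisfying the criterion of Lemma 3.10, and compare. The sketch below does this for a range of discriminants. All function names are our own, and the check is an illustration of the statement, not a substitute for the remaining cases of the proof.

```python
from math import gcd

def reduced_forms(D):
    """Reduced primitive positive definite forms of discriminant D < 0."""
    forms, a = [], 1
    while 3*a*a <= -D:
        for b in range(-a, a + 1):
            if (b*b - D) % (4*a) != 0:
                continue
            c = (b*b - D) // (4*a)
            if c < a or gcd(gcd(a, abs(b)), c) != 1:
                continue
            if b < 0 and (-b == a or a == c):
                continue
            forms.append((a, b, c))
        a += 1
    return forms

def mu(D):
    """The number mu of Proposition 3.11 for D < 0, D = 0 or 1 mod 4."""
    m, r = -D, 0
    while m % 2 == 0:
        m //= 2
    p = 3
    while p*p <= m:                 # count the distinct odd primes dividing D
        if m % p == 0:
            r += 1
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:
        r += 1
    if D % 4 == 1:
        return r
    n = -D // 4
    if n % 4 == 3:
        return r
    if n % 4 in (1, 2) or n % 8 == 4:
        return r + 1
    return r + 2                    # n = 0 mod 8

def count_order_at_most_2(D):
    """Number of reduced forms with b = 0, a = b or a = c (Lemma 3.10)."""
    return sum(1 for (a, b, c) in reduced_forms(D) if b == 0 or a == b or a == c)

for D in (-56, -96, -164):
    print(D, mu(D), count_order_at_most_2(D))
# prints: -56 2 2, then -96 3 4, then -164 2 2
```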


This is not the last we will see of the number $\mu$, for it also plays an
important role in genus theory.


B. Genus Theory 


As in §2, we define two forms of discriminant D to be in the same genus
if they represent the same values in $(\mathbb{Z}/D\mathbb{Z})^*$. Let’s recall the
classification of genera given in §2. Consider the subgroups
$H \subset \ker(\chi) \subset (\mathbb{Z}/D\mathbb{Z})^*$,
where H consists of the values represented by the principal form, and
$\chi : (\mathbb{Z}/D\mathbb{Z})^* \to \{\pm 1\}$ is defined by $\chi([p]) = (D/p)$ for $p \nmid D$ prime. Then
the key result was Lemma 2.24, where we proved that the values
represented in $(\mathbb{Z}/D\mathbb{Z})^*$ by a given form f(x,y) are a coset of H in $\ker(\chi)$. This
coset determines which genus f(x,y) is in.
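For a small discriminant, this classification can be carried out by listing the values each reduced form represents among the units modulo |D|. The sketch below does so for D = -56, with the four reduced forms entered by hand. The function names are our own, and the exhaustive loop over residues is only practical for small |D|.

```python
from math import gcd

def unit_values(form, D):
    """Values in (Z/DZ)^* represented by the form (a, b, c), found by
    running x and y over all residues modulo |D|."""
    a, b, c = form
    m = abs(D)
    return frozenset(
        v for x in range(m) for y in range(m)
        for v in [(a*x*x + b*x*y + c*y*y) % m]
        if gcd(v, m) == 1
    )

def genera(forms, D):
    """Group forms by the set of unit values they represent."""
    groups = {}
    for f in forms:
        groups.setdefault(unit_values(f, D), []).append(f)
    return groups

D = -56
forms = [(1, 0, 14), (2, 0, 7), (3, -2, 5), (3, 2, 5)]   # reduced forms of D = -56
for values, members in genera(forms, D).items():
    print(sorted(values), members)
# prints [1, 9, 15, 23, 25, 39] [(1, 0, 14), (2, 0, 7)]
# then   [3, 5, 13, 19, 27, 45] [(3, -2, 5), (3, 2, 5)]
```

The first set is the subgroup H and the second is the coset 3H, so discriminant -56 has two genera of two classes each.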

Our first step is to relate this theory to the class group C(D). Since all
forms in a given class represent the same numbers, sending the class to the
coset of $H \subset \ker(\chi)$ it represents defines a map

(3.12)  $\Phi : C(D) \to \ker(\chi)/H$.

Note that a given fiber $\Phi^{-1}(H')$, $H' \in \ker(\chi)/H$, consists of all classes in
a given genus (this is what we called the genus of H' in Theorem 2.26),
and the image of $\Phi$ may thus be identified with the set of genera. A crucial
observation is that $\Phi$ is a group homomorphism:
Lemma 3.13. The map $\Phi$ which maps a class in C(D) to the coset of values
represented in $\ker(\chi)/H$ is a group homomorphism.


Proof. Let f(x,y) and g(x,y) be two forms of discriminant D taking values
in the cosets H' and H'' respectively. We can assume that their Dirichlet
composition F(x,y) is defined, so that a product of values represented by
f(x,y) and g(x,y) is represented by F(x,y). Then F(x,y) represents values
in H'H'', which proves that H'H'' is the coset associated to the
composition of f(x,y) and g(x,y). Thus $\Phi$ is a homomorphism. Q.E.D.


This lemma has the following consequences: 


Corollary 3.14. Let $D \equiv 0,1 \bmod 4$ be negative. Then:


(i) All genera of forms of discriminant D consist of the same number of
classes.


(ii) The number of genera of forms of discriminant D is a power of two.


Proof. The first statement follows since all fibers of a homomorphism have
the same number of elements. To prove the second, first note that the
subgroup H contains all squares in $(\mathbb{Z}/D\mathbb{Z})^*$. This is obvious because if f(x,y)
is the principal form, then $f(x,0) = x^2$. Thus every element in $\ker(\chi)/H$
has order $\le 2$, and it follows from the structure theorem for finite Abelian
groups that $\ker(\chi)/H \simeq \{\pm 1\}^m$ for some m. Thus the image of $\Phi$, being
a subgroup of $\ker(\chi)/H$, has order $2^k$ for some k. Since $\Phi(C(D))$ tells us
the number of genera, we are done. Q.E.D.


Note also that $\Phi(C(D))$ gives a natural group structure on the set of
genera, or as Gauss would say, one can define the composition of genera
[41, §§246-247].

These elementary facts are nice, but they aren’t the whole story. The 
real depth of the relation between composition and genera is indicated by 
the following theorem: 


Theorem 3.15. Let $D \equiv 0,1 \bmod 4$ be negative. Then:

(i) There are $2^{\mu-1}$ genera of forms of discriminant D, where $\mu$ is the number
defined in Proposition 3.11.

(ii) The principal genus (the genus containing the principal form) consists of
the classes in $C(D)^2$, the subgroup of squares in the class group C(D).
Thus every form in the principal genus arises by duplication.


Proof. We first need to give a more efficient method for determining when
two forms are in the same genus. The basic idea is to use certain assigned
characters, which are defined as follows. Let $p_1, \dots, p_r$ be the distinct odd
primes dividing D. Then consider the functions:

$\chi_i(a) = \left(\frac{a}{p_i}\right)$        defined for a prime to $p_i$, i = 1,...,r
$\delta(a) = (-1)^{(a-1)/2}$        defined for a odd
$\epsilon(a) = (-1)^{(a^2-1)/8}$    defined for a odd.

Rather than using all of these functions, we assign only certain ones,
depending on the discriminant D. When $D \equiv 1 \bmod 4$, we define
$\chi_1, \dots, \chi_r$ to be the assigned characters, and when $D \equiv 0 \bmod 4$, we
write D = -4n, and then the assigned characters are defined by the following
table:

n                         assigned characters
$n \equiv 3 \bmod 4$      $\chi_1, \dots, \chi_r$
$n \equiv 1 \bmod 4$      $\chi_1, \dots, \chi_r, \delta$
$n \equiv 2 \bmod 8$      $\chi_1, \dots, \chi_r, \delta\epsilon$
$n \equiv 6 \bmod 8$      $\chi_1, \dots, \chi_r, \epsilon$
$n \equiv 4 \bmod 8$      $\chi_1, \dots, \chi_r, \delta$
$n \equiv 0 \bmod 8$      $\chi_1, \dots, \chi_r, \delta, \epsilon$

Note that the number of assigned characters is exactly the number $\mu$ given
in Proposition 3.11. It is easy to see that the assigned characters give a
homomorphism

(3.16)  $\Psi : (\mathbb{Z}/D\mathbb{Z})^* \to \{\pm 1\}^\mu$.

The crucial property of $\Psi$ is the following:

Lemma 3.17. The homomorphism $\Psi : (\mathbb{Z}/D\mathbb{Z})^* \to \{\pm 1\}^\mu$ of (3.16) is
surjective and its kernel is the subgroup H of values represented by the principal
form. Thus $\Psi$ induces an isomorphism

$(\mathbb{Z}/D\mathbb{Z})^*/H \xrightarrow{\sim} \{\pm 1\}^\mu$.


Proof. When $D \equiv 1 \bmod 4$, the proof is quite easy. First note that if p is
an odd prime, then for any $m \ge 1$, the Legendre symbol (a/p) induces a
surjective homomorphism

(3.18)  $(\cdot/p) : (\mathbb{Z}/p^m\mathbb{Z})^* \to \{\pm 1\}$

whose kernel is exactly the subgroup of squares of $(\mathbb{Z}/p^m\mathbb{Z})^*$ (see
Exercise 3.11). Now let $D = -\prod_{i=1}^r p_i^{m_i}$ be the prime factorization of D. The
Chinese Remainder Theorem tells us that

$(\mathbb{Z}/D\mathbb{Z})^* \simeq \prod_{i=1}^r (\mathbb{Z}/p_i^{m_i}\mathbb{Z})^*$,

so that the map $\Psi$ can be interpreted as the map

$\prod_{i=1}^r (\mathbb{Z}/p_i^{m_i}\mathbb{Z})^* \to \{\pm 1\}^r$

given by $([a_1], \dots, [a_r]) \mapsto ((a_1/p_1), \dots, (a_r/p_r))$. By the analysis of (3.18), it
follows that $\Psi$ is surjective and its kernel is exactly the subgroup of squares
of $(\mathbb{Z}/D\mathbb{Z})^*$. By part (c) of Exercise 2.17, this equals the subgroup H of
values represented by the principal form $x^2 + xy + ((1 - D)/4)y^2$, and we
are done.

The proof is more complicated when D = -4n, mainly because the
subgroup H represented by $x^2 + ny^2$ may be slightly larger than the subgroup
of squares. However, the above argument using the Chinese Remainder
Theorem can be adapted to this case. The odd primes dividing n are no
problem, but 2 causes considerable difficulty (see Exercise 3.11 for the
details). Q.E.D.


We can now prove Theorem 3.15. To prove (i), note that $\ker(\chi)$ has
index 2 in $(\mathbb{Z}/D\mathbb{Z})^*$. By Lemma 3.17, it follows that $\ker(\chi)/H$ has order $2^{\mu-1}$.
We know that the number of genera is the order of $\Phi(C(D)) \subset \ker(\chi)/H$,
so that it suffices to show $\Phi(C(D)) = \ker(\chi)/H$. Since $\Phi$ maps a class to
the coset of values it represents, we need to show that every congruence
class in $\ker(\chi)$ contains a number represented by a form of discriminant
D. This is easy: Dirichlet’s theorem on primes in arithmetic progressions
tells us that any class in $\ker(\chi)$ contains an odd prime p. But $[p] \in \ker(\chi)$
means that $\chi([p]) = (D/p) = 1$, so that by Lemma 2.5, p is represented by
a form of discriminant D, and (i) is proved.

To prove (ii), let C denote the class group C(D). Since
$\Phi : C \to \ker(\chi)/H \simeq \{\pm 1\}^{\mu-1}$ is a homomorphism, it follows that
$C^2 \subset \ker(\Phi)$, and we get an induced map

(3.19)  $C/C^2 \to \{\pm 1\}^{\mu-1}$.

We compute the order of $C/C^2$ as follows. The squaring map from C to
itself gives a short exact sequence

$0 \to C_0 \to C \to C^2 \to 0$

where $C_0$ is the subgroup of elements of order $\le 2$. It follows that the index
$[C : C^2]$ equals the order of $C_0$, which is $2^{\mu-1}$ by Proposition 3.11.

Thus, in the map given in (3.19), both the domain and the range have the
same order. But from (i) we know that the map is surjective, so that it
must be an isomorphism. Hence $C^2$ is exactly the kernel of the map $\Phi$.
Since $\ker(\Phi)$ consists of the classes in the principal genus, the theorem is
proved. Q.E.D.


We have now proved the main theorems of genus theory for primitive 
positive definite forms. These results are due to Gauss and appear in the 
fifth section of Disquisitiones Arithmeticae [41, §§229-287]. Gauss’ treat- 
ment is more general than ours, for he considers both the definite and 
indefinite forms, and in particular, he shows that Proposition 3.11 and The- 
orem 3.15 are true for any nonsquare discriminant, positive or negative. His 
proofs are quite difficult, and at the end of this long series of arguments, 
Gauss makes the following comment about genus theory [41, §287]: 


these theorems are among the most beautiful in the theory of bi- 
nary forms, especially because, despite their extreme simplicity, they 
are so profound that a rigorous demonstration requires the help of 
many other investigations. 


Besides these theorems, there is another component to Gauss’ genus the- 
ory not mentioned so far: Gauss’ second proof of quadratic reciprocity [41, 
§262], which uses the genus theory developed above. We will not discuss 
Gauss’ proof since it uses forms of positive discriminant, though the main 
ideas of the proof are outlined in Exercises 3.12 and 3.13. Many people 
regard this as the deepest of Gauss’ many proofs of quadratic reciprocity. 

Gauss’ approach to genus theory is somewhat different from ours. In 
Disquisitiones, genera are defined in terms of the assigned characters intro- 
duced in the proof of Theorem 3.15. Given a form f(x,y) of discriminant 
D, let f(x,y) represent a number a relatively prime to D. If the $\mu$ assigned
characters are evaluated at a, then Gauss calls the resulting $\mu$-tuple
the complete character of f(x,y), and he defines two forms of discriminant 
D to be in the same genus if they have the same complete character [41, 
§231]. The following lemma shows that this is equivalent to our previous 
definition of genus: 


Lemma 3.20. The complete character depends only on the form f(x,y), and 
two forms of discriminant D lie in the same genus (as defined in §2) if and 
only if they have the same complete character. 


Proof. Suppose that f(x,y) represents a, where a is relatively prime to D.
Then Gauss’ complete character is nothing other than $\Psi([a])$, where $\Psi$ is
the map defined in (3.16). By Lemma 2.24, the possible a’s lie in a coset
H' of H in $(\mathbb{Z}/D\mathbb{Z})^*$, and this coset determines the genus of f(x,y). Using
Lemma 3.17, it follows that the complete character is uniquely determined
by H', and Lemma 3.20 is proved. Q.E.D.
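Gauss’ complete character is simple to compute in practice. The sketch below evaluates the assigned characters for D = -56 (so n = 14, which is 6 mod 8, and the assigned characters are $\chi_1$ for p = 7 and $\epsilon$) on the first value prime to D that each reduced form represents. The function names are our own, the Legendre symbol is computed by Euler’s criterion, and the reduced forms are entered by hand.

```python
from math import gcd

def odd_prime_divisors(m):
    """Distinct odd primes dividing the positive integer m."""
    while m % 2 == 0:
        m //= 2
    primes, p = [], 3
    while p*p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:
        primes.append(m)
    return primes

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def delta(a):
    return (-1) ** ((a - 1) // 2)

def epsilon(a):
    return (-1) ** ((a*a - 1) // 8)

def assigned_characters(D):
    """Assigned characters for D < 0, following the table in the text."""
    chars = [lambda a, p=p: legendre(a, p) for p in odd_prime_divisors(-D)]
    if D % 4 == 0:
        n = -D // 4
        if n % 4 == 1 or n % 8 == 4:
            chars.append(delta)
        elif n % 8 == 2:
            chars.append(lambda a: delta(a) * epsilon(a))
        elif n % 8 == 6:
            chars.append(epsilon)
        elif n % 8 == 0:
            chars.extend([delta, epsilon])
    return chars

def complete_character(form, D):
    """Evaluate the assigned characters at the first value prime to D
    that the form represents (one exists by Lemma 2.25)."""
    a, b, c = form
    for x in range(abs(D)):
        for y in range(abs(D)):
            v = a*x*x + b*x*y + c*y*y
            if v > 0 and gcd(v, D) == 1:
                return tuple(ch(v) for ch in assigned_characters(D))
    raise ValueError("no value prime to D found")

for f in [(1, 0, 14), (2, 0, 7), (3, -2, 5), (3, 2, 5)]:
    print(f, complete_character(f, -56))
# prints (1, 1) for the first two forms and (-1, -1) for the last two
```

The two complete characters that occur separate the four classes into the same two genera obtained from the values represented modulo 56.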
We should mention that Gauss’ use of the word “character” is where the
modern term “group character” comes from. Also, it is interesting to note 
that Gauss never mentions the connection between his characters and La- 
grange’s implicit genus theory. While Gauss’ characters make it easy to de- 
cide when two forms belong to the same genus (see Exercise 3.14 for an ex- 
ample), they are not very intuitive. Unfortunately, most of Gauss’ successors 
followed his presentation of genus theory, so that readers were presented 
with long lists of characters and no motivation whatsoever. The simple idea 
of grouping forms according to the congruence classes they represent was 
usually not mentioned. This happens in Dirichlet [28, pp. 313-316] and in 
Mathews [78, pp. 132-136], although Smith [95, pp. 202-207] does discuss 
congruence classes. 

So far we have discussed two ways to formulate genera, Lagrange’s and 
Gauss’. There are many other ways to state the definition, but before we 
can discuss them, we need some terminology. We say that two forms f(x, y) 
and g(x,y) are equivalent over a ring R if there is a matrix
$\begin{pmatrix} p & q \\ r & s \end{pmatrix} \in GL(2,R)$
such that f(x,y) = g(px + qy, rx + sy). If $R = \mathbb{Z}/m\mathbb{Z}$, we say that f(x,y)
and g(x,y) are equivalent modulo m. We then have the following theorem:


Theorem 3.21. Let f(x,y) and g(x,y) be primitive forms of discriminant
$D \ne 0$, positive definite if D < 0. Then the following statements are
equivalent:

(i) f(x,y) and g(x,y) are in the same genus, i.e., they represent the same
values in $(\mathbb{Z}/D\mathbb{Z})^*$.

(ii) f(x,y) and g(x,y) represent the same values in $(\mathbb{Z}/m\mathbb{Z})^*$ for all
nonzero integers m.

(iii) f(x,y) and g(x,y) are equivalent modulo m for all nonzero
integers m.

(iv) f(x,y) and g(x,y) are equivalent over the p-adic integers $\mathbb{Z}_p$ for all
primes p.

(v) f(x,y) and g(x,y) are equivalent over Q via a matrix in GL(2, Q) whose
entries have denominators prime to 2D.

(vi) f(x,y) and g(x,y) are equivalent over Q without essential denominator,
i.e., given any nonzero m, a matrix in GL(2,Q) can be found which
takes one form to the other and whose entries have denominators prime
to m.


Proof. It is easy to prove (vi) $\Rightarrow$ (iii) $\Rightarrow$ (ii) $\Rightarrow$ (i) and (vi) $\Rightarrow$ (v) $\Rightarrow$ (i) (see
Exercise 3.15), and (iii) $\Leftrightarrow$ (iv) is a standard argument using the compactness
of $\mathbb{Z}_p$ (see Borevich and Shafarevich [8, p. 41] for an analogous case).
A proof of (i) $\Rightarrow$ (iii) appears in Hua [57, §12.5, Exercise 4], and (i) $\Rightarrow$ (iv)
is in Jones [63, pp. 103-104]. Finally, the implication (iv) $\Rightarrow$ (vi) uses the
Hasse principle for the equivalence of forms over $\mathbb{Q}$ and may be found in
Jones [63, Theorem 40] or Siegel [91]. Q.E.D.


Some modern texts give yet a different definition, saying that two forms 
are in the same genus if and only if they are equivalent over $\mathbb{Q}$ (see, for
example, Borevich and Shafarevich [8, p. 241]). This characterization doesn’t
hold in general ($x^2 + 18y^2$ and $2x^2 + 9y^2$ are rationally equivalent but
belong to different genera; see Exercise 3.16), but it does work for field
discriminants, which means that $D \equiv 1 \bmod 4$, D squarefree, or $D = 4k$,
$k \not\equiv 1 \bmod 4$, k squarefree (see Exercise 3.17; we will study such
discriminants in more detail in §5). According to Dickson [26, Vol. III, pp. 216
and 236], Eisenstein suggested in 1852 that genera could be defined using 
rational equivalence, and only later, in 1867, did Smith point out that extra 
assumptions are needed on the denominators. 
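
The counterexample mentioned above can be checked by brute force. The following is only a sketch: the helper names are made up for this illustration, and the substitution $(x,y) \mapsto (3y, x/3)$ is one possible rational change of variables, chosen here by hand. It lists the values in $(\mathbb{Z}/72\mathbb{Z})^*$ taken by each form (both have discriminant $-72$) and confirms that the substitution carries $2x^2 + 9y^2$ to $x^2 + 18y^2$.

```python
from fractions import Fraction
from math import gcd

def unit_values(a, b, c, modulus):
    """Residues in (Z/modulus Z)* taken by a*x^2 + b*x*y + c*y^2."""
    values = set()
    for x in range(modulus):
        for y in range(modulus):
            v = (a * x * x + b * x * y + c * y * y) % modulus
            if gcd(v, modulus) == 1:
                values.add(v)
    return values

# Both forms have discriminant -72, so genera are detected modulo 72.
f_vals = unit_values(1, 0, 18, 72)   # x^2 + 18y^2
g_vals = unit_values(2, 0, 9, 72)    # 2x^2 + 9y^2
print(sorted(f_vals))
print(sorted(g_vals))
print("same genus:", f_vals == g_vals)

def g_after_substitution(x, y):
    """Evaluate 2X^2 + 9Y^2 at (X, Y) = (3y, x/3), using exact rationals."""
    X, Y = 3 * Fraction(y), Fraction(x) / 3
    return 2 * X * X + 9 * Y * Y

# The substitution matrix has a denominator 3, which is not prime to 2D.
print(all(g_after_substitution(x, y) == x * x + 18 * y * y
          for x in range(-5, 6) for y in range(-5, 6)))
```

The denominator 3 in the substitution divides the discriminant, which is consistent with the denominator conditions in parts (v) and (vi) of Theorem 3.21.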


C. $p = x^2 + ny^2$ and Euler’s Convenient Numbers


Our discussion of genus theory has distracted us from our problem of
determining when a prime p can be written as $x^2 + ny^2$. Recall from Corollary
2.27 that genus theory gives us congruence conditions for p to be
represented by a reduced form in the principal genus. The nicest case is when
every genus of discriminant $-4n$ consists of a single class, for then we get
congruence conditions that characterize $p = x^2 + ny^2$ (this is what made
the examples in (2.28) work). Let’s see if the genus theory developed in this
section can shed any light on this special case. We have the following result:


Theorem 3.22. Let n be a positive integer. Then the following statements are
equivalent:

(i) Every genus of forms of discriminant $-4n$ consists of a single class.

(ii) If $ax^2 + bxy + cy^2$ is a reduced form of discriminant $-4n$, then either
$b = 0$, $a = b$ or $a = c$.

(iii) Two forms of discriminant $-4n$ are equivalent if and only if they are
properly equivalent.

(iv) The class group $C(-4n)$ is isomorphic to $(\mathbb{Z}/2\mathbb{Z})^m$ for some integer m.

(v) The class number $h(-4n)$ equals $2^{\mu-1}$, where $\mu$ is as in Proposition
3.11.


Proof. We will prove (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (v) $\Rightarrow$ (i). Let C denote
the class group $C(-4n)$.

Since the principal genus is $C^2$ by Theorem 3.15, (i) implies that $C^2 = \{1\}$,
so that every element of C has order $\le 2$. Then Lemma 3.10 shows
that (i) $\Rightarrow$ (ii).

Next assume (ii), and suppose that two forms of discriminant $-4n$ are
equivalent. By Exercise 3.8, we know that one is properly equivalent to the
other or its opposite. We may assume that the forms are reduced, so that
by assumption $b = 0$, $a = b$ or $a = c$. The proof of Theorem 2.8 shows that
forms of this type are always properly equivalent to their opposites, so that
the forms are properly equivalent. This proves (ii) $\Rightarrow$ (iii).

Recall that any form is equivalent to its opposite via $(x,y) \mapsto (x,-y)$.
Thus (iii) implies that any form and its opposite lie in the same class in C.
Since the opposite gives the inverse in C by Theorem 3.9, we see that every
class is its own inverse. The structure theorem for finite Abelian groups
shows that the only groups with this property are $(\mathbb{Z}/2\mathbb{Z})^m$, and (iii) $\Rightarrow$ (iv)
is proved.

Next, Theorem 3.15 implies that the number of genera is $[C : C^2] = 2^{\mu-1}$,
so that

(3.23)    $h(-4n) = |C| = [C : C^2]\,|C^2| = 2^{\mu-1}|C^2|.$

If (iv) holds, then $C^2 = \{1\}$, and then (v) follows immediately from (3.23).
Finally, given (v), (3.23) implies that $C^2 = \{1\}$, so that by Theorem 3.15, the
principal genus consists of a single class. Since every genus consists of the
same number of classes, (i) follows, and the theorem is proved. Q.E.D.


Notice how this theorem runs the full gamut of what we’ve done so far: 
the conditions of Theorem 3.22 involve genera, reduced forms, the class 
number, the structure of the class group and the relation between equivalence
and proper equivalence. For computational purposes, the last condition
(v) is especially useful, for it only requires knowing the class number.
This makes it much easier to verify that the examples in (2.28) have only 
one class per genus. 
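
The conditions of Theorem 3.22 are easy to test by machine. The sketch below is an illustration only, with function names invented for it. It assumes the usual definition of a primitive reduced form from §2 ($|b| \le a \le c$, with $b \ge 0$ when $|b| = a$ or $a = c$), enumerates the reduced forms of discriminant $-4n$, reads off the class number, and tests condition (ii).

```python
from math import gcd

def reduced_forms(n):
    """Primitive reduced forms (a, b, c) of discriminant -4n, for n > 0."""
    forms = []
    a = 1
    while 3 * a * a <= 4 * n:              # |b| <= a <= c forces 3a^2 <= 4n
        for b in range(-a, a + 1):
            if (b * b + 4 * n) % (4 * a):  # c must be an integer
                continue
            c = (b * b + 4 * n) // (4 * a)
            if c < a or gcd(gcd(a, b), c) != 1:
                continue
            if b < 0 and (-b == a or a == c):
                continue                   # boundary cases are listed with b >= 0
            forms.append((a, b, c))
        a += 1
    return forms

def class_number(n):
    """h(-4n), counted as the number of primitive reduced forms."""
    return len(reduced_forms(n))

def one_class_per_genus(n):
    """Condition (ii) of Theorem 3.22: b = 0, a = b or a = c for every reduced form."""
    return all(b == 0 or a == b or a == c for a, b, c in reduced_forms(n))

print([n for n in range(1, 101) if one_class_per_genus(n)])
print(class_number(1848), one_class_per_genus(1848))
```

Condition (ii) is used here instead of (v) because it needs nothing beyond the list of reduced forms; checking (v) would also require the number $\mu$ from Proposition 3.11.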

Near the end of the fifth section of Disquisitiones, Gauss lists 65 dis- 
criminants that satisfy this theorem [41, §303]. Grouped according to class 
number, they are: 


$h(-4n)$   n’s with one class per genus
1         1, 2, 3, 4, 7
2         5, 6, 8, 9, 10, 12, 13, 15, 16, 18, 22, 25, 28, 37, 58
4         21, 24, 30, 33, 40, 42, 45, 48, 57, 60, 70, 72, 78, 85, 88, 93, 102, 112,
          130, 133, 177, 190, 232, 253
8         105, 120, 165, 168, 210, 240, 273, 280, 312, 330, 345, 357, 385,
          408, 462, 520, 760
16        840, 1320, 1365, 1848

Gauss was interested in these 65 n’s not for their relation to the question of
when $p = x^2 + ny^2$, but rather because they had been discovered earlier by
Euler in a different context. Euler called a number n a convenient number
(numerus idoneus) if it satisfies the following criterion:


Let m be an odd number relatively prime to n which is properly
represented by $x^2 + ny^2$. If the equation $m = x^2 + ny^2$ has only one
solution with $x, y \ge 0$, then m is a prime number.
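
Euler’s criterion is straightforward to try numerically. The sketch below uses illustrative helper names and plain trial division rather than anything Euler did by hand. It lists every solution of $m = x^2 + ny^2$ in nonnegative integers and independently tests primality, using $n = 1848$ and the value $m = 18{,}518{,}809$ that Euler treated.

```python
from math import gcd, isqrt

def representations(m, n):
    """All (x, y) with x, y >= 0 and x^2 + n*y^2 == m."""
    found = []
    y = 0
    while n * y * y <= m:
        rest = m - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            found.append((x, y))
        y += 1
    return found

def is_prime(m):
    """Primality by trial division; adequate for numbers of this size."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

n, m = 1848, 18518809
reps = representations(m, n)
print("m odd and prime to n:", m % 2 == 1 and gcd(m, n) == 1)
print("representations:", reps)
print("unique:", len(reps) == 1, "| prime:", is_prime(m))
```

Since 1848 is a convenient number, the criterion says that a single representation (which is then automatically proper) forces m to be prime; the trial division is there only as an independent check.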


Euler was interested in convenient numbers because they helped him find 
large primes. For example, working with n = 1848, he was able to show that 


$18{,}518{,}809 = 197^2 + 1848 \cdot 100^2$


is prime, a large one for Euler’s time. Convenient numbers are a fascinating 
topic, and the reader should consult Frei [38] or Weil [106, pp. 219-226] for 
a fuller discussion. We will confine ourselves to the following remarkable 
observation of Gauss: 


Proposition 3.24. A positive integer n is a convenient number if and only if 
for forms of discriminant —4n, every genus consists of a single class. 


Proof. We begin with a lemma: 


Lemma 3.25. Let m be a positive odd number relatively prime to $n > 1$.
Then the number of ways that m is properly represented by a reduced form of
discriminant $-4n$ is

$$2 \prod_{p \mid m} \left(1 + \left(\frac{-n}{p}\right)\right).$$


Proof. See Exercise 3.20 or Landau [71, Vol. 1, p. 144]. Q.E.D. 
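
Lemma 3.25 can be spot-checked numerically. The following sketch uses hypothetical helper names and assumes the standard definition of reduced forms from §2. It counts proper representations of m by every reduced form of discriminant $-4n$ and compares the total with $2\prod_{p \mid m}(1 + (-n/p))$, where the Legendre symbol is computed by Euler’s criterion.

```python
from math import gcd, isqrt, prod

def reduced_forms(n):
    """Primitive reduced forms (a, b, c) of discriminant -4n, for n > 0."""
    forms = []
    a = 1
    while 3 * a * a <= 4 * n:
        for b in range(-a, a + 1):
            if (b * b + 4 * n) % (4 * a):
                continue
            c = (b * b + 4 * n) // (4 * a)
            if c < a or gcd(gcd(a, b), c) != 1:
                continue
            if b < 0 and (-b == a or a == c):
                continue
            forms.append((a, b, c))
        a += 1
    return forms

def proper_representations(form, m):
    """Number of coprime integer pairs (x, y) with a*x^2 + b*x*y + c*y^2 == m."""
    a, b, c = form
    n = (4 * a * c - b * b) // 4
    x_max = isqrt(c * m // n)   # 4c*f = (2cy + bx)^2 + 4n*x^2 gives n*x^2 <= c*m
    y_max = isqrt(a * m // n)   # 4a*f = (2ax + by)^2 + 4n*y^2 gives n*y^2 <= a*m
    return sum(1
               for x in range(-x_max, x_max + 1)
               for y in range(-y_max, y_max + 1)
               if gcd(x, y) == 1 and a * x * x + b * x * y + c * y * y == m)

def prime_divisors(m):
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def count_by_forms(n, m):
    return sum(proper_representations(f, m) for f in reduced_forms(n))

def count_by_lemma(n, m):
    return 2 * prod(1 + legendre(-n, p) for p in prime_divisors(m))

print(count_by_forms(14, 15), count_by_lemma(14, 15))
```

For $n = 14$ and $m = 15$ the representations come from $x^2 + 14y^2$ and $2x^2 + 7y^2$, which lie in the same genus, in line with the remark that two forms representing m must share a genus.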


This classical lemma belongs to an area of quadratic forms that we have 
ignored, namely the study of the number of representations of a number by 
a form. To see what this has to do with genus theory, note that two forms 
representing m must lie in the same genus, for the values they represent in 
$(\mathbb{Z}/4n\mathbb{Z})^*$ are not disjoint. We thus get the following corollary of Lemma
3.25:


Corollary 3.26. Let m be properly represented by a primitive positive definite
form f(x,y) of discriminant $-4n$, $n > 1$, and assume that m is odd and
relatively prime to n. If r denotes the number of prime divisors of m, then m
is properly represented in exactly $2^{r+1}$ ways by a reduced form in the genus of
f(x,y). Q.E.D.

Now we can prove the proposition. First, assume that there is only one
class per genus. If m is properly represented by $x^2 + ny^2$ and $m = x^2 + ny^2$
has a unique solution when $x, y \ge 0$, then we need to prove that m is prime.
The above corollary shows that m is properly represented by $x^2 + ny^2$ in
$2^{r+1}$ ways since $x^2 + ny^2$ is the only reduced form in its genus. At least
$2^{r-1}$ of these representations satisfy $x, y \ge 0$, and then our assumption on
m implies that $r = 1$, i.e., m is a prime power $p^a$. If $a \ge 2$, then Lemma
3.25 shows that $p^{a-2}$ also has a proper representation, and it follows easily
that m has at least two representations in nonnegative integers. This
contradiction proves that m is prime, and hence n is a convenient number.

Conversely, assume that n is convenient. Let f(x,y) be a form of
discriminant $-4n$, and let g(x,y) be the composition of f(x,y) with itself. We
can assume that g(x,y) is reduced, and it suffices to show that $g(x,y) = x^2 + ny^2$
(for then every element in the class group has order $\le 2$, which
implies one class per genus by Theorem 3.22).

Assume that $g(x,y) \ne x^2 + ny^2$, and let p and q be distinct odd primes
not dividing n which are represented by f(x,y). (In §9 we will prove that
f(x,y) represents infinitely many primes.) Then g(x,y) represents pq, and
formula (2.31) shows that $x^2 + ny^2$ does too. By Corollary 3.26, pq has
only 8 proper representations by reduced forms of discriminant $-4n$. At
least one comes from g(x,y), leaving at most 7 for $x^2 + ny^2$. It follows that
pq is uniquely represented by $x^2 + ny^2$ when we restrict to nonnegative
integers. This contradicts our assumption that n is convenient. Q.E.D.


Gauss never states Proposition 3.24 formally, but it is implicit in the 
methods he discusses for factoring large numbers [41, §§329-334]. 

In §2 we asked how many such n’s there were. Gauss suggests [41, §303] 
that the 65 given by Euler are the only ones. In 1934 Chowla [17] proved 
that the number of such n’s is finite, and by 1973 it was known that Euler’s 
list is complete except for possibly one more n (see Weinberger [108]).
Whether or not this last n actually exists is still an open question.

From our point of view, the upshot is that there are only finitely many
theorems like (2.28) where $p = x^2 + ny^2$ is characterized by simple
congruences modulo 4n. Thus genus theory cannot solve our basic question for
all n. In some cases, such as $D = -108$, it’s completely useless (all three
reduced forms $x^2 + 27y^2$ and $4x^2 \pm 2xy + 7y^2$ lie in the same genus), and
even when it’s a partial help, such as $D = -56$, we’re still stuck (we can
separate $x^2 + 14y^2$ and $2x^2 + 7y^2$ from $3x^2 \pm 2xy + 5y^2$, but we can’t
distinguish between the first two). And notice that by part (iii) of Theorem 3.21,
forms in the same genus are equivalent modulo m for all $m \ne 0$, so that
no matter how m is chosen, there are no congruences $p \equiv a, b, c, \ldots \bmod m$
which can separate forms in the same genus. Something new is needed. In
1833, Dirichlet described the situation as follows [27, Vol. I, p. 201]:
there lies in the mentioned [genus] theory an incompleteness, in
that it certainly shows that a prime number, as soon as it is contained 
in a linear form [congruence class], necessarily must assume one of 
the corresponding quadratic forms, only without giving any a priori 
method for deciding which quadratic form it will be. ... It becomes 
clear that the characteristic property of a single quadratic form be- 
longing to a group [genus] cannot be expressed through the prime 
numbers in the corresponding linear forms, but necessarily must be 
expressed by another theory not depending on the elements at hand. 
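
The two discriminants discussed before Dirichlet’s remark, $D = -108$ and $D = -56$, can be examined directly. The sketch below is a brute-force illustration with invented function names, again assuming the definition of reduced forms from §2. It groups the reduced forms of a negative discriminant D by the set of values they represent in $(\mathbb{Z}/D\mathbb{Z})^*$, which is the definition of genus used in this section.

```python
from math import gcd

def reduced_forms(D):
    """Primitive reduced forms (a, b, c) of discriminant D < 0."""
    forms = []
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or gcd(gcd(a, b), c) != 1:
                continue
            if b < 0 and (-b == a or a == c):
                continue
            forms.append((a, b, c))
        a += 1
    return forms

def unit_values(form, D):
    """Values of the form in (Z/DZ)*, found by running over x, y modulo |D|."""
    a, b, c = form
    M = abs(D)
    values = set()
    for x in range(M):
        for y in range(M):
            v = (a * x * x + b * x * y + c * y * y) % M
            if gcd(v, M) == 1:
                values.add(v)
    return frozenset(values)

def genera(D):
    """Reduced forms of discriminant D, grouped by the values they represent."""
    groups = {}
    for form in reduced_forms(D):
        groups.setdefault(unit_values(form, D), []).append(form)
    return list(groups.values())

for D in (-56, -108):
    print(D, genera(D))
```

For $D = -56$ the output separates the forms into two genera of two classes each, while for $D = -108$ all three classes land in a single genus, so no congruence condition can tell them apart.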


As we already know from Euler’s conjectures concerning $x^2 + 27y^2$ and
$x^2 + 64y^2$ (see (1.22) and (1.23)), the new theory we’re seeking involves
residues of higher powers. Gauss rediscovered Euler’s conjectures in 1805, 
and he proved them in the course of his work on cubic and biquadratic reci- 
procity. In §4 we will give careful statements of these reciprocity theorems 
and show how they can be used to prove Euler’s conjectures. 


D. Disquisitiones Arithmeticae 


Gauss’ Disquisitiones Arithmeticae covers a wide range of topics in number
theory, including congruences, quadratic reciprocity, quadratic forms
(in two and three variables), and the cyclotomic fields $\mathbb{Q}(\zeta_n)$, $\zeta_n = e^{2\pi i/n}$.
There are several excellent accounts of what’s in Disquisitiones, notably
Bühler [13, Chapter 3], Bachmann [42, Vol. X.2.1, pp. 8-40] and Rieger
[84], and translations into English and German are available (see item [41]
in the references). Rather than try to survey the whole book, we will instead
make some comments on Gauss’ treatment of quadratic reciprocity
and quadratic forms, for in each case he does things slightly differently from
the theory presented in §§2 and 3.

Disquisitiones contains the first published (valid) proof of the law of qua- 
dratic reciprocity. One surprise is that Gauss never uses the term “qua- 
dratic reciprocity”. Instead, Gauss uses the phrase “fundamental theorem”, 
which he explains as follows [41, §131]: 


Since almost everything that can be said about quadratic residues 
depends on this theorem, the term fundamental theorem which we will 
use from now on should be acceptable. 


In the more informal setting of his mathematical diary, Gauss uses the term 
“golden theorem” to describe his high regard for quadratic reciprocity [42, 
Vol. X.1, entries 16, 23 and 30 on pp. 496-501] (see Gray [44] for an English 
translation). Likewise absent from Disquisitiones is the Legendre symbol, 
for Gauss uses the notation aRb or aNb to indicate whether or not a was a 
quadratic residue modulo b [41, §131]. (The Legendre symbol does appear
in some of his handwritten notes, see [42, Vol. X.1, p. 53], but this doesn’t
happen very often.)

One reason why Gauss ignored Legendre’s terminology is that Gauss
discovered quadratic reciprocity independently of his predecessors. In a marginal
note in his copy of Disquisitiones, Gauss states that “we discovered the
fundamental theorem by induction in March 1795. We found our first proof,
the one contained in this section, April 1796” [41, p. 468, English editions]
or [42, Vol. I, p. 476]. In 1795 Gauss was still a student at the Collegium
Carolinum in Brunswick, and only later, while at Göttingen, did he discover
the earlier work of Euler and Legendre on reciprocity.

Gauss’ proof from April 1796 appears in §§135-144 of Disquisitiones.
The theorem is stated in two forms: the usual version of quadratic
reciprocity appears in [41, §131], and the more general version that holds for
the Jacobi symbol (which we used in the proof of Lemma 1.14) is given in
[41, §133]. The proof uses complete induction on the prime p, and there
are many cases to consider, some of which use reciprocity for the Jacobi
symbol (which would hold for numbers smaller than p). As Gauss wrote in
1808, the proof “proceeds by laborious steps and is burdened by detailed
calculations” [42, Vol. II, p. 4]. In 1857, Dirichlet used the Jacobi symbol
to simplify the proof and reduce the number of cases to just two [27, Vol.
II, pp. 121-138]. It is interesting to note that what Gauss proves in
Disquisitiones is actually a bit more general than the usual statement of quadratic
reciprocity for the Jacobi symbol (see Exercise 3.24). Thus, when Jacobi
introduced the Jacobi symbol in 1837 [61, Vol. VI, p. 262], he was simply
giving a nicer but less general formulation of what was already in
Disquisitiones.

As we mentioned in our discussion of genus theory, Disquisitiones also 
contains a second proof of reciprocity that is quite different in nature. The 
first proof is awkward but elementary, while the second uses Gauss’ genus 
theory and is much more sophisticated. 

Gauss’ treatment of quadratic forms occupies the fifth (and longest) sec- 
tion of Disquisitiones. It is not easy reading, for many of the arguments 
are very complicated. Fortunately, there are more modern texts that cover 
pretty much the same material (in particular, see either Flath [36] or Math- 
ews [78]). Gauss starts with the case of positive definite forms, and the 
theory he develops is similar to the first part of §2. Then, in [41, §182],
he gives some applications to number theory, which are introduced as 
follows: 


Let us now consider certain particular cases both because of their 
remarkable elegance and because of the painstaking work done on 
them by Euler, who endowed them with an almost classical distinc- 
tion. 
As might be expected, Gauss first proves Fermat’s three theorems (1.1),
and then he proves Euler’s conjecture for $p = x^2 + 5y^2$ using Lagrange’s
implicit genus theory (his proof is similar to what we did in (2.19), (2.20)
and (2.22)). Interestingly enough, Gauss never mentions the relation
between this example and genus theory. In contrast to Lagrange and
Legendre, Gauss works out few examples. His one comment is that “the reader
can derive this proposition [concerning $x^2 + 5y^2$] and an infinite number
of other particular ones from the preceding and the following discussions”
[41, §182].

Gauss always assumed that the middle coefficient was even, so that his
forms were written $f(x,y) = ax^2 + 2bxy + cy^2$. He used the ordered triple
(a,b,c) to denote f(x,y) [41, §153], and he defined its determinant to be
$b^2 - ac$ [41, §154]. Note that the discriminant of $ax^2 + 2bxy + cy^2$ is
$(2b)^2 - 4ac = 4(b^2 - ac)$, which is just 4 times Gauss’ determinant.

Gauss did not assume that the coefficients of his forms were relatively
prime, and he organized forms into orders according to the common
divisors of the coefficients. More precisely, the forms $ax^2 + 2bxy + cy^2$ and
$a'x^2 + 2b'xy + c'y^2$ are in the same order provided that $\gcd(a,b,c) = \gcd(a',b',c')$
and $\gcd(a,2b,c) = \gcd(a',2b',c')$ [41, §226]. To get a better
idea of how this works, consider a primitive quadratic form $ax^2 + bxy + cy^2$.
Here, a, b and c are relatively prime integers, and b may be even or
odd. We can fit this form into Gauss’ scheme as follows:

b is even: Then $b = 2b'$, and $ax^2 + 2b'xy + cy^2$ satisfies
$\gcd(a,b',c) = \gcd(a,2b',c) = 1$. Gauss called forms in this order
properly primitive.

b is odd: Then $2ax^2 + 2bxy + 2cy^2$ satisfies $\gcd(2a,b,2c) = 1$,
$\gcd(2a,2b,2c) = 2$. Gauss called forms in this order
improperly primitive.

So all primitive forms are present, though the ones with b odd appear in
disguised form. This doesn’t affect the class number, but it does cause
problems with composition.
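
The translation into Gauss’ scheme can be written out as a small routine. This is only a sketch of the two cases just described; the function name and the returned labels are choices made for this illustration, not notation from Disquisitiones.

```python
from math import gcd

def gauss_triple(a, b, c):
    """Translate a primitive form a*x^2 + b*x*y + c*y^2 into Gauss' notation.

    Returns ((A, B, C), label), where the triple (A, B, C) stands for
    A*x^2 + 2*B*x*y + C*y^2 and label names the order it belongs to.
    """
    if gcd(gcd(a, b), c) != 1:
        raise ValueError("expected a primitive form")
    if b % 2 == 0:
        triple = (a, b // 2, c)          # b even: the form is used as it stands
    else:
        triple = (2 * a, b, 2 * c)       # b odd: the form is doubled
    A, B, C = triple
    g1 = gcd(gcd(A, B), C)
    g2 = gcd(gcd(A, 2 * B), C)
    label = "properly primitive" if (g1, g2) == (1, 1) else "improperly primitive"
    return triple, label

def determinant(triple):
    """Gauss' determinant B^2 - A*C of the triple (A, B, C)."""
    A, B, C = triple
    return B * B - A * C

for form in [(1, 0, 14), (3, 2, 5), (1, 1, 1)]:
    triple, label = gauss_triple(*form)
    print(form, "->", triple, label, "determinant", determinant(triple))
```

For a form with b odd, such as $x^2 + xy + y^2$ of discriminant $-3$, the doubled form has determinant $-3$ and discriminant $-12$, which shows how such forms appear in disguise.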

Gauss’ classification of forms thus consists of orders, which are made 
up of genera, which are in turn made up of classes. This is reminiscent of 
the Linnaean classification in biology, where the categories are class, order,
family, genus and species. Gauss’ terms all appear on Linnaeus’ list, and it
is thus likely that this is where Gauss got his terminology. Since our current 
term “equivalence class” comes from Gauss’ example of classes of properly 
equivalent forms, we see that there is an unexpected link between modern 
set theory and eighteenth-century biology. 
Finally, let’s make one comment about composition. Gauss’ theory of
composition has always been one of the more difficult parts of Disquisi- 
tiones to read, and part of the reason is the complexity of Gauss’ presenta- 
tion. For example, the proof that composition is associative involves check- 
ing that 28 equations are satisfied [41, §240]. But a multiplicity of equations 
is not the only difficulty here—there is also an interesting conceptual issue. 
Namely, in order to define the class group, notice that Gauss has to put 
the structure of an abstract Abelian group on a set of equivalence classes. 
Considering that we’re talking about the year 1801, this is an amazing level 
of abstraction. But then, Disquisitiones is an amazing book. 


E. Exercises 


3.1. Assume that $F(x,y) = Ax^2 + Bxy + Cy^2$ is the composition of the
forms $f(x,y) = ax^2 + bxy + cy^2$ and $g(x,y) = a'x^2 + b'xy + c'y^2$ via

$$f(x,y)g(z,w) = F(a_1xz + b_1xw + c_1yz + d_1yw,\ a_2xz + b_2xw + c_2yz + d_2yw),$$

and suppose that all three forms have discriminant $D \ne 0$. The goal
of this exercise is to prove Gauss’ formulas (3.1).

(a) By specializing the variables x, y, z and w, prove that

$$aa' = Aa_1^2 + Ba_1a_2 + Ca_2^2$$
$$ac' = Ab_1^2 + Bb_1b_2 + Cb_2^2$$
$$ab' = 2Aa_1b_1 + B(a_1b_2 + a_2b_1) + 2Ca_2b_2.$$

Hint: for the first one, try x = z = 1 and y = w = 0.

(b) Prove that $a = \pm(a_1b_2 - a_2b_1)$. Hint: prove that

$$a^2(b'^2 - 4a'c') = (a_1b_2 - a_2b_1)^2(B^2 - 4AC).$$

(c) Prove that $a' = \pm(a_1c_2 - a_2c_1)$.

3.2. Show that the compositions given in (2.30) and (2.31) are not direct
compositions.

3.3. Prove Lemma 3.5. Hint: there are $a, a_1, \ldots, a_r$ such that
$am + \sum_{i=1}^r a_i p_i = 1$.

3.4. Verify that the congruences (3.4) satisfy the compatibility conditions
of Lemma 3.5.

3.5. Let $f(x,y) = ax^2 + bxy + cy^2$, $g(x,y) = a'x^2 + b'xy + c'y^2$ and B
be as in Lemma 3.2. We want to show that $aa'x^2 + Bxy + Cy^2$,
$C = (B^2 - D)/4aa'$, is the direct composition of f(x,y) and g(x,y).

(a) Show that f(x,y) (resp. g(x,y)) is properly equivalent to
$ax^2 + Bxy + a'Cy^2$ (resp. $a'x^2 + Bxy + aCy^2$). Hint: for f(x,y), use
$B \equiv b \bmod 2a$.

(b) Let $X = xz - Cyw$ and $Y = axw + a'yz + Byw$. Then show that

$$(ax^2 + Bxy + a'Cy^2)(a'z^2 + Bzw + aCw^2) = aa'X^2 + BXY + CY^2.$$

Furthermore, show that this is a direct composition in the sense
of (3.1). Hint: first show that

$$(ax + (B + \sqrt{D})y/2)(a'z + (B + \sqrt{D})w/2) = aa'X + (B + \sqrt{D})Y/2.$$

(c) Suppose that a form G(x,y) is the direct composition of forms
h(x,y) and k(x,y). If $\tilde{h}(x,y)$ is properly equivalent to h(x,y),
then show that G(x,y) is also the direct composition of $\tilde{h}(x,y)$
and k(x,y).

(d) Use (a)-(c) to show that the Dirichlet composition is a direct
composition.

(e) Prove that the Dirichlet composition of primitive forms is
primitive. Hint: since F(x,y) represents any product f(x,y)g(z,w),
show that the gcd of all numbers represented by F(x,y) is 1,
and conclude that F(x,y) is primitive.
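
Before attempting the algebra in part (b) of Exercise 3.5, the product formula can be spot-checked on random integers. The sketch below treats a, a', B and C as free integers (so that $D = B^2 - 4aa'C$ is implied rather than chosen); the function name and the sampling range are arbitrary choices for this illustration, and passing the check is of course not a proof.

```python
import random

def identity_holds(a, a1, B, C, x, y, z, w):
    """Compare both sides of the product formula of Exercise 3.5(b).

    Here a1 plays the role of a'; X and Y are the bilinear expressions
    defined in the exercise.
    """
    X = x * z - C * y * w
    Y = a * x * w + a1 * y * z + B * y * w
    left = (a * x * x + B * x * y + a1 * C * y * y) * \
           (a1 * z * z + B * z * w + a * C * w * w)
    right = a * a1 * X * X + B * X * Y + C * Y * Y
    return left == right

rng = random.Random(0)   # fixed seed so that the run is reproducible
samples = [tuple(rng.randint(-20, 20) for _ in range(8)) for _ in range(1000)]
print(all(identity_holds(*s) for s in samples))
```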


3.6. This problem studies the relation between Legendre’s and Dirichlet’s
formulas for composition.

(a) Suppose that $f(x,y) = ax^2 + 2bxy + cy^2$ and $g(x,y) = a'x^2 + 2b'xy + c'y^2$
have the same discriminant and satisfy $\gcd(a,a') = 1$.
Show that the Dirichlet composition of these forms is the one
given by Legendre’s formula with both signs + in (2.32).

(b) In Exercise 2.26, we saw that the forms $14x^2 + 10xy + 21y^2$ and
$9x^2 + 2xy + 30y^2$ compose to $126x^2 \pm 74xy + 13y^2$ and
$126x^2 \pm 38xy + 5y^2$. Which one of these four is the direct composition
of the original two forms?


3.7. Show that $acx^2 + bxy + y^2$ is properly equivalent to the principal
form. 


3.8. For us, a class consists of all forms properly equivalent to a given 
form. Let a Lagrangian class (this terminology is due to Weil [106, 
p. 319]) consist of all forms equivalent (properly or improperly) to a 
given form.

(a) Prove that the Lagrangian class of a form is the union of the
class of the form and the class of its opposite.

(b) Show that the following statements are equivalent:
(i) The Lagrangian class of f(x,y) equals the class of f(x,y).
(ii) f(x,y) is properly equivalent to its opposite.
(iii) f(x,y) is properly and improperly equivalent to itself.
(iv) The class of f(x,y) has order $\le 2$ in the class group.

3.9. In this problem we will describe the “almost” group structure given
by Legendre’s theory of composition. Let G be an Abelian group
and let ~ be the equivalence relation which identifies $a^{-1}$ and a for
all $a \in G$.

(a) Show that multiplication on G induces an operation on G/~
which takes either one or two values. Furthermore, if $a, b \in G$
and [a], [b] are their classes in G/~, then show that $[a] \cdot [b]$
takes on only one value if and only if a, b or ab has order $\le 2$
in G.

(b) If G is cyclic of order 8, show that G/~ is isomorphic (in the
obvious sense) to the structure given by (2.33) and (2.34).

(c) If C(D) is the class group of forms of discriminant D, show that
C(D)/~ can be naturally identified with the set of Lagrangian
classes of forms of discriminant D (see Exercise 3.8).

3.10. Complete the proof of Proposition 3.11 for the case $D = -4n$,
$n \equiv 1 \bmod 4$, and prove all of the remaining cases.

3.11. This exercise is concerned with the proof of Lemma 3.17.

(a) Prove that the map (3.18) is surjective and its kernel is the
subgroup of squares.

(b) We next want to prove the lemma when $D = -4n$, $n > 0$. Write
$n = 2^a m$ where m is odd, so that we have an isomorphism

$$(\mathbb{Z}/D\mathbb{Z})^* \simeq (\mathbb{Z}/2^{a+2}\mathbb{Z})^* \times (\mathbb{Z}/m\mathbb{Z})^*.$$

Let H denote the subgroup of values represented by $x^2 + ny^2$.

(i) Show that $H = H_1 \times (\mathbb{Z}/m\mathbb{Z})^{*2}$ where
$H_1 = H \cap ((\mathbb{Z}/2^{a+2}\mathbb{Z})^* \times \{1\})$.

(ii) When $a \ge 4$, show that $H_1 = (\mathbb{Z}/2^{a+2}\mathbb{Z})^{*2}$, where $H_1$ is as
in (i). Hint: the description of $(\mathbb{Z}/2^{a+2}\mathbb{Z})^*$ given in Ireland
and Rosen [59, §4.1] will be useful.

(iii) Prove Lemma 3.17 when $D \equiv 0 \bmod 4$. Hint: treat the cases
a = 0, 1, 2, 3 and $a \ge 4$ separately. See also Ireland and Rosen
[59, §4.1].


3.12. In Exercises 3.12 and 3.13 we will sketch Gauss’ second proof of
quadratic reciprocity. There are two parts to the proof: first, one
shows, without using quadratic reciprocity, that for any nonsquare
discriminant D,

(*) the number of genera of forms of discriminant D is $\le 2^{\mu-1}$,

where $\mu$ is defined in Proposition 3.11, and second, one shows that
(*) implies quadratic reciprocity. This exercise will do the first step,
and Exercise 3.13 will take care of the second.

We proved in Exercise 2.10 that when $D > 0$ is not a perfect
square, there are only finitely many proper equivalence classes of
primitive forms of discriminant D. The set of equivalence classes
will be denoted C(D), and as in the positive definite case, C(D)
becomes a finite Abelian group under Dirichlet composition (we will
prove this in the exercises to §7). We will assume that Proposition
3.11 and Theorem 3.15 hold for all nonsquare discriminants D. This
is where we pay the price for restricting ourselves to positive definite
forms: the proofs in the text only work for $D < 0$. For proofs of
these theorems when $D > 0$, see Flath [36, Chapter V], Gauss [41,
§§257-258] or Mathews [78, pp. 171-173].

To prove (*), let D be any nonsquare discriminant, and let C denote
the class group C(D). Let $H \subset (\mathbb{Z}/D\mathbb{Z})^*$ be the subgroup of
values represented by the principal form.

(a) Show that genera can be classified by cosets of H in $(\mathbb{Z}/D\mathbb{Z})^*$.
Thus, instead of the map $\Phi$ of (3.12), we can use the map

$$\Phi' : C \to (\mathbb{Z}/D\mathbb{Z})^*/H,$$

so that $\ker(\Phi')$ is the principal genus and $\Phi'(C)$ is the set of genera.
Note that this argument does not use quadratic reciprocity.

(b) Since H contains all squares in $(\mathbb{Z}/D\mathbb{Z})^*$, it follows that
$C^2 \subset \ker(\Phi')$. Now adapt the proof of Theorem 3.15 to show that

$$\text{the number of genera is} \le [C : C^2] = 2^{\mu-1},$$

where the last equality follows from Proposition 3.11. This
proves (*).

3.13. In this exercise we will show that quadratic reciprocity follows from
statement (*) of Exercise 3.12. As we saw in §1, it suffices to show

$$\left(\frac{p^*}{q}\right) = 1 \iff \left(\frac{q}{p}\right) = 1,$$

where p and q are distinct odd primes and $p^* = (-1)^{(p-1)/2}p$.
(a) Show that Lemma 3.17 holds for all nonzero discriminants D, so
that we can use the assigned characters to distinguish genera.

(b) Assume that $(p^*/q) = 1$. Applying Lemma 2.5 with $D = p^*$, q is
represented by a form f(x,y) of discriminant $p^*$. The number
$\mu$ from Proposition 3.11 is 1, so that by (*), there is only one
genus. Hence the assigned character (there is only one in this
case) must equal 1 on any number represented by f(x,y), in
particular q. Use this to prove that $(q/p) = 1$. This proves that
$(p^*/q) = 1 \Rightarrow (q/p) = 1$.

(c) Next, assume that $(q/p) = 1$ and that either $p \equiv 1 \bmod 4$ or
$q \equiv 1 \bmod 4$. Use part (b) to show that $(p^*/q) = 1$.

(d) Finally, assume that $(q/p) = 1$ and that $p \equiv q \equiv 3 \bmod 4$. This
time we will consider forms of discriminant pq. Proposition
3.11 shows that $\mu = 2$, so that by (*), there are at most two
genera. Furthermore, the assigned characters are $\chi_1(a) = (a/p)$
and $\chi_2(a) = (a/q)$. Now consider the form
$f(x,y) = px^2 + pxy + ((p-q)/4)y^2$, which is easily seen to have discriminant
pq. Letting $(x,y) = (0,2)$, it represents $p - q$. Use this to compute
the complete character of the forms f(x,y) and $-f(x,y)$,
and show that one of these must lie in the principal genus since
there are at most two genera. Then show that $(-p/q) = 1$. Note
that parts (c) and (d) imply that $(q/p) = 1 \Rightarrow (p^*/q) = 1$, which
completes the proof of quadratic reciprocity.

(e) Gauss also used (*) to show that $(2/p) = (-1)^{(p^2-1)/8}$. Adapt
the argument given above to prove this. Hint: when
$p \equiv 3, 5 \bmod 8$, show that p is properly represented by a form of
discriminant 8. When $p \equiv 1 \bmod 8$, note that the form
$2x^2 + xy + ((1-p)/8)y^2$ has discriminant p and represents 2, and the
argument is similar when $p \equiv 7 \bmod 8$.

3.14. Use Gauss’ definition of genus to divide the forms of discriminant
$-164$ into genera. Hint: the forms are given in (2.33). Notice that
this is much easier than working with our original definition!

3.15. Prove the implications (vi) $\Rightarrow$ (iii) $\Rightarrow$ (ii) $\Rightarrow$ (i) and (vi) $\Rightarrow$ (v) $\Rightarrow$ (i)
of Theorem 3.21.

3.16. Prove that the forms $x^2 + 18y^2$ and $2x^2 + 9y^2$ are rationally
equivalent but belong to different genera. Hint: if they represent the same
values in $(\mathbb{Z}/72\mathbb{Z})^*$, then the same is true for any divisor of 72.

3.17. Let D be a field discriminant, i.e., $D \equiv 1 \bmod 4$, D squarefree, or
$D = 4k$, $k \not\equiv 1 \bmod 4$, k squarefree. Let f(x,y) and g(x,y) be two
forms of discriminant D which are rationally equivalent. We want to
prove that they lie in the same genus.


(a) Let m be prime to D and represented by g(x,y). Show that
f(x,y) represents $d^2m$ for some nonzero integer d.

(b) Show that f(x,y) and g(x,y) lie in the same genus. Hint: by
Exercise 2.1, f(x,y) properly represents m' where $d'^2m' = d^2m$
for some integer d'. Show that m' is relatively prime to D. To
do this, use Lemma 2.3 to write $f(x,y) = m'x^2 + bxy + cy^2$.

3.18. When $D = -4n$ is a field discriminant, we can use Theorem 3.21
to give a different proof that every form in the principal genus is a
square (this is part (ii) of Theorem 3.15). Let f(x,y) be a form of
discriminant $-4n$ which lies in the principal genus.

(a) Show that f(x,y) properly represents a number of the form $a^2$,
where a is odd and relatively prime to n. Hint: use part (v) of
Theorem 3.21.

(b) By (a), we may assume that $f(x,y) = a^2x^2 + 2bxy + cy^2$. Show
that $\gcd(a,2b) = 1$, and conclude that $g(x,y) = ax^2 + 2bxy + acy^2$
has relatively prime coefficients and discriminant $-4n$.

(c) Show that f(x,y) is the Dirichlet composition of g(x,y) with
itself.

This argument is due to Arndt (see Smith [95, pp. 254-256]),
though Arndt proved (a) using the theorem of Legendre discussed
in Exercise 2.24. Note that (a) can be restated in terms of ternary
forms: if f(x,y) is in the principal genus, then (a) proves that the
ternary form $f(x,y) - z^2$ has a nontrivial zero. This result shows
that there is a connection between ternary forms and genus theory.
It is therefore not surprising that Gauss used ternary forms in his
proof of Theorem 3.15.

3.19. Let C(D) be the class group of forms of discriminant $D < 0$. Prove
that the following statements are equivalent:
(i) Every genus of discriminant D consists of a single class.
(ii) $C(D) \simeq \{\pm 1\}^{\mu-1}$, where $\mu$ is as in Proposition 3.11.
(iii) Every genus of discriminant D consists of equivalent forms.

3.20. In this exercise we will prove Lemma 3.25. Let $m > 0$ be odd and
prime to $n > 1$.

(a) Show that the number of solutions modulo m of the congruence

$$x^2 \equiv -n \bmod m$$

is given by the formula

$$\prod_{p \mid m} \left(1 + \left(\frac{-n}{p}\right)\right).$$

(b) Consider forms g(x,y) of discriminant $-4n$ of the form

$$g(x,y) = mx^2 + 2bxy + cy^2, \quad 0 \le b < m.$$

Show that the map sending g(x,y) to $[b] \in (\mathbb{Z}/m\mathbb{Z})^*$ induces a
bijection between the g(x,y)’s and the solutions modulo m of
$x^2 \equiv -n \bmod m$.

(c) Let f(x,y) have discriminant $-4n$ and let $f(u,v) = m$ be a
proper representation. Pick $r_0, s_0$ so that $us_0 - vr_0 = 1$, and set
$r = r_0 + uk$, $s = s_0 + vk$. Note that as $k \in \mathbb{Z}$ varies, we get all
solutions of $us - vr = 1$. Then set

$$g(x,y) = f(ux + ry, vx + sy)$$

and show that there is a unique $k \in \mathbb{Z}$ such that g(x,y) satisfies
the condition of (b). This form is denoted $g_{u,v}(x,y)$.

(d) Show that the map sending a proper representation $f(u,v) = m$
to the form $g_{u,v}(x,y)$ is onto.

(e) If $g_{u',v'}(x,y) = g_{u,v}(x,y)$, let

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} u' & r' \\ v' & s' \end{pmatrix} \begin{pmatrix} u & r \\ v & s \end{pmatrix}^{-1}.$$

Show that $f(\alpha x + \beta y, \gamma x + \delta y) = f(x,y)$ and, since $n > 1$, show
that $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \pm\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Hint: assume that f(x,y) is reduced, and
use the arguments from the uniqueness part of the proof of
Theorem 2.8.

(f) Conclude that $g_{u',v'}(x,y) = g_{u,v}(x,y)$ if and only if
$(u',v') = \pm(u,v)$, so that the map of (d) is exactly two-to-one. Combining
this with (a) and (b), we get a proof of Lemma 3.25.


3.21. This exercise will use Lemma 3.25 to study the equation $m^3 = a^2 + 2b^2$.

(a) If m is odd, use Lemma 3.25 to show that the equations
$m = x^2 + 2y^2$ and $m^3 = x^2 + 2y^2$ have the same number of proper
solutions.

(b) If $m = a^2 + 2b^2$ is a proper representation, then show that

$$m^3 = (a^3 - 6ab^2)^2 + 2(3a^2b - 2b^3)^2$$

is a proper representation.

(c) Show that the map sending (a,b) to $(a^3 - 6ab^2, 3a^2b - 2b^3)$ is
injective. Hint: note that

$$(a + b\sqrt{-2})^3 = (a^3 - 6ab^2) + (3a^2b - 2b^3)\sqrt{-2}.$$

(d) Combine (a) and (c) to show that all proper representations of
$m^3 = x^2 + 2y^2$, m odd, arise from (b).


3.22. Use Exercise 3.21 to prove Fermat's famous result that $(x,y) = (3, \pm 5)$
are the only integral solutions of the equation $x^3 = y^2 + 2$.
Hint: first show that x must be odd, and then apply Exercise 3.21
to the proper representation $x^3 = y^2 + 2 \cdot 1^2$. It's likely that Fermat's
original proof of this result was similar to the argument presented
here, though he would have used a version of Lemma 1.4 to prove
part (c) of Exercise 3.21. See Weil [106, pp. 68-69 and 71-73] for
more details.


3.23. Let p be an odd prime of the form $x^2 + ny^2$, n > 1. Use Lemma
3.25 to show that the equation

$$p = x^2 + ny^2$$

has a unique solution once we require x and y to be nonnegative.
Note also that Lemma 3.25 gives a very quick proof of Exercise 2.27.


3.24. This exercise will examine a generalization of the Jacobi symbol. Let
P and Q be relatively prime nonzero integers, where Q is odd but
possibly negative. Then define the extended Jacobi symbol (P/Q) via

$$\left(\frac{P}{Q}\right) = \begin{cases} \left(\dfrac{P}{|Q|}\right) & \text{when } |Q| > 1 \\ 1 & \text{when } |Q| = 1. \end{cases}$$

(a) Prove that when P and Q are odd and relatively prime, then

$$\left(\frac{P}{Q}\right)\left(\frac{Q}{P}\right) = (-1)^{(P-1)(Q-1)/4 + (\operatorname{sgn}(P)-1)(\operatorname{sgn}(Q)-1)/4}$$

where $\operatorname{sgn}(P) = P/|P|$.

(b) Gauss' version of (a) is more complicated to state. First, given P
and Q as above, he lets p denote the number of prime factors of
Q (counted with multiplicity) for which P is not a quadratic
residue. This relates to (P/Q) by the formula

$$\left(\frac{P}{Q}\right) = (-1)^p.$$

Interchanging P and Q, we get a similarly defined number q.
To relate the parity of p and q, Gauss states a rule in [41,
§133] which breaks up into 10 separate cases. Verify that the
rule proved in (a) covers all 10 of Gauss' cases.

(c) Prove the supplementary laws:

$$\left(\frac{-1}{P}\right) = \operatorname{sgn}(P)(-1)^{(P-1)/2}$$

$$\left(\frac{2}{P}\right) = (-1)^{(P^2-1)/8}.$$

3.25. Let $p \equiv 1 \bmod 8$ be prime.

(a) If $C(-4p)$ is the class group of forms of discriminant $-4p$, then
use genus theory to prove that

$$C(-4p) \simeq (\mathbb{Z}/2^a\mathbb{Z}) \times G$$

where $a \ge 1$ and G has odd order. Thus $2 \mid h(-4p)$.

(b) Let $f(x,y) = 2x^2 + 2xy + ((p+1)/2)y^2$. Use Gauss's definition
of genus to show that f(x,y) is in the principal genus.

(c) Use Theorem 3.15 to show that $C(-4p)$ has an element of order
4. Thus $4 \mid h(-4p)$.


§4. CUBIC AND BIQUADRATIC RECIPROCITY 


In this section we will study cubic and biquadratic reciprocity and use them
to prove Euler's conjectures for $p = x^2 + 27y^2$ and $p = x^2 + 64y^2$ (see
(1.22) and (1.23)). An interesting feature of these reciprocity theorems is
that each one requires that we extend the notion of integer: for cubic
reciprocity we will use the ring

(4.1) $\mathbb{Z}[\omega] = \{a + b\omega : a, b \in \mathbb{Z}\}, \quad \omega = e^{2\pi i/3} = (-1 + \sqrt{-3})/2,$

and for biquadratic reciprocity we will use the Gaussian integers

(4.2) $\mathbb{Z}[i] = \{a + bi : a, b \in \mathbb{Z}\}, \quad i = \sqrt{-1}.$

Both $\mathbb{Z}[\omega]$ and $\mathbb{Z}[i]$ are subrings of the complex numbers (see Exercise 4.1).
Our first task will be to describe the arithmetic properties of these rings
and determine their units and primes. We will then define the generalized
Legendre symbols $(\alpha/\pi)_3$ and $(\alpha/\pi)_4$ and state the laws of cubic and
biquadratic reciprocity. The proofs will be omitted since excellent proofs are
already available in print (see especially Ireland and Rosen [59, Chapter
9]). At the end of the section we will discuss Gauss' work on reciprocity
and say a few words about the origins of class field theory.



A. $\mathbb{Z}[\omega]$ and Cubic Reciprocity


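The law of cubic reciprocity is intimately bound up with the ring $\mathbb{Z}[\omega]$ of
(4.1). The main tool used to study the arithmetic of $\mathbb{Z}[\omega]$ is the norm
function: if $\alpha = a + b\omega$ is in $\mathbb{Z}[\omega]$, then its norm $N(\alpha)$ is the
nonnegative integer

$$N(\alpha) = \alpha\bar{\alpha} = a^2 - ab + b^2,$$

where $\bar{\alpha}$ is the complex conjugate of $\alpha$ (in Exercise 4.1 we will see that
$\bar{\alpha} \in \mathbb{Z}[\omega]$). Note that the norm is multiplicative, i.e., for $\alpha, \beta \in \mathbb{Z}[\omega]$, we have

$$N(\alpha\beta) = N(\alpha)N(\beta)$$

(see Exercise 4.2). Using the norm, one can prove that $\mathbb{Z}[\omega]$ is a Euclidean
ring:


Proposition 4.3. Given $\alpha, \beta \in \mathbb{Z}[\omega]$, $\beta \ne 0$, there are $\gamma, \delta \in \mathbb{Z}[\omega]$ such that

$$\alpha = \gamma\beta + \delta \quad \text{and} \quad N(\delta) < N(\beta).$$

Thus $\mathbb{Z}[\omega]$ is a Euclidean ring.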


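Proof. Note that the norm function $N(\alpha) = \alpha\bar{\alpha}$ is defined on
$\mathbb{Q}(\omega) = \{r + s\omega : r, s \in \mathbb{Q}\}$ and satisfies $N(uv) = N(u)N(v)$ for
$u, v \in \mathbb{Q}(\omega)$ (see Exercise 4.2). Then

$$\frac{\alpha}{\beta} = \frac{\alpha\bar{\beta}}{\beta\bar{\beta}} = \frac{\alpha\bar{\beta}}{N(\beta)} \in \mathbb{Q}(\omega),$$

so that $\alpha/\beta = r + s\omega$ for some $r, s \in \mathbb{Q}$. Let $r_1, s_1$ be integers such that
$|r - r_1| \le 1/2$ and $|s - s_1| \le 1/2$, and then set $\gamma = r_1 + s_1\omega$ and $\delta = \alpha - \gamma\beta$.
Note that $\gamma, \delta \in \mathbb{Z}[\omega]$ and $\alpha = \gamma\beta + \delta$. It remains to show that $N(\delta) < N(\beta)$.
To see this, let $\epsilon = \alpha/\beta - \gamma = (r - r_1) + (s - s_1)\omega$, and note that

$$\delta = \alpha - \gamma\beta = \beta(\alpha/\beta - \gamma) = \beta\epsilon.$$

Since the norm is multiplicative, it suffices to prove that $N(\epsilon) < 1$. But

$$N(\epsilon) = N((r - r_1) + (s - s_1)\omega) = (r - r_1)^2 - (r - r_1)(s - s_1) + (s - s_1)^2,$$

and the desired inequality follows from $|r - r_1|, |s - s_1| \le 1/2$, since each of
the three terms is then at most 1/4 in absolute value. By the standard
definition of a Euclidean ring (see, for example, Herstein [54, §3.7]),
we are done. Q.E.D.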
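
The rounding argument in this proof can be tried out numerically. The following is a minimal sketch, not taken from the text: it assumes elements of $\mathbb{Z}[\omega]$ are stored as integer pairs (a, b) standing for $a + b\omega$, and the helper names `norm`, `mul`, `conj` and `divmod_eisenstein` are hypothetical names chosen for illustration.

```python
from fractions import Fraction

# An element a + b*omega of Z[omega] is modelled as the pair (a, b).
# This representation and all function names are illustrative assumptions.

def norm(x):
    """N(a + b*omega) = a^2 - a*b + b^2."""
    a, b = x
    return a * a - a * b + b * b

def mul(x, y):
    """Product in Z[omega], using omega^2 = -1 - omega."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)

def conj(x):
    """Complex conjugate: a + b*omega^2 = (a - b) - b*omega."""
    a, b = x
    return (a - b, -b)

def divmod_eisenstein(alpha, beta):
    """Return (gamma, delta) with alpha = gamma*beta + delta, N(delta) < N(beta)."""
    n = norm(beta)
    if n == 0:
        raise ZeroDivisionError("beta must be nonzero")
    # alpha/beta = alpha*conj(beta)/N(beta) = r + s*omega with r, s rational.
    t = mul(alpha, conj(beta))
    r, s = Fraction(t[0], n), Fraction(t[1], n)
    # Round each coordinate to a nearest integer, so |r - r1|, |s - s1| <= 1/2.
    gamma = (round(r), round(s))
    gb = mul(gamma, beta)
    delta = (alpha[0] - gb[0], alpha[1] - gb[1])
    return gamma, delta

if __name__ == "__main__":
    gamma, delta = divmod_eisenstein((7, 3), (2, 1))
    print(gamma, delta, norm(delta), norm((2, 1)))
```

Exact rational arithmetic is used so that the rounding step is not affected by floating-point error.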


Corollary 4.4. $\mathbb{Z}[\omega]$ is a PID (principal ideal domain) and a UFD (unique
factorization domain).


Proof. It is well known that any Euclidean ring is a PID and a UFD (see,
for example, Herstein [54, Theorems 3.7.1 and 3.7.2]). Q.E.D.

For completeness, let's recall the definitions of PID and UFD. Let R
be an integral domain. An ideal of R is principal if it can be written in
the form $\alpha R = \{\alpha\beta : \beta \in R\}$ for some $\alpha \in R$, and R is a PID if every ideal
of R is principal. To explain what a UFD is, we first need to define units,
associates and irreducibles:

(i) $\alpha \in R$ is a unit if $\alpha\beta = 1$ for some $\beta \in R$.
(ii) $\alpha, \beta \in R$ are associates if $\alpha$ is a unit times $\beta$. This is equivalent to
$\alpha R = \beta R$.
(iii) A nonunit $\alpha \in R$ is irreducible if $\alpha = \beta\gamma$ in R implies that $\beta$ or $\gamma$ is a
unit.


Then R is a UFD if every nonunit $\alpha \ne 0$ can be written as a product of
irreducibles, and given two such factorizations of $\alpha$, each irreducible in the
first factorization can be matched up in a one-to-one manner with an
associate irreducible in the second. Thus factorization is unique up to order
and associates.

It turns out that being a PID is the stronger property: every PID is a
UFD (see Ireland and Rosen [59, §1.3]), but the converse is not true (see
Exercise 4.3). Given an element $\alpha \ne 0$ in a PID R, the following statements
are equivalent:

(i) $\alpha$ is irreducible.
(ii) $\alpha$ is prime (an element $\alpha$ of R is prime if $\alpha \mid \beta\gamma$ implies $\alpha \mid \beta$ or $\alpha \mid \gamma$).
(iii) $\alpha R$ is a prime ideal (an ideal $\mathfrak{p}$ of R is prime if $\beta\gamma \in \mathfrak{p}$ implies $\beta \in \mathfrak{p}$
or $\gamma \in \mathfrak{p}$).
(iv) $\alpha R$ is a maximal ideal.


(See Exercise 4.4 for the proof.)


Since $\mathbb{Z}[\omega]$ is a PID and a UFD, the next step is to determine the units
and primes of $\mathbb{Z}[\omega]$. Let's start with the units:


Lemma 4.5.
(i) An element $\alpha \in \mathbb{Z}[\omega]$ is a unit if and only if $N(\alpha) = 1$.
(ii) The units of $\mathbb{Z}[\omega]$ are $\mathbb{Z}[\omega]^* = \{\pm 1, \pm\omega, \pm\omega^2\}$.
Proof. See Exercise 4.5. Q.E.D.


The next step is to describe the primes of $\mathbb{Z}[\omega]$. The following lemma
will be useful:


Lemma 4.6. If $\alpha \in \mathbb{Z}[\omega]$ and $N(\alpha)$ is a prime in $\mathbb{Z}$, then $\alpha$ is prime in $\mathbb{Z}[\omega]$.

Proof. Since $\mathbb{Z}[\omega]$ is a PID, it suffices to prove that $\alpha$ is irreducible. So
suppose that $\alpha = \beta\gamma$ in $\mathbb{Z}[\omega]$. Taking norms, we obtain the integer equation

$$N(\alpha) = N(\beta\gamma) = N(\beta)N(\gamma)$$

(recall that the norm is multiplicative). Since $N(\alpha)$ is prime by assumption,
this implies that $N(\beta)$ or $N(\gamma)$ is 1, so that $\beta$ or $\gamma$ is a unit by Lemma
4.5. Q.E.D.


We can now determine all primes in $\mathbb{Z}[\omega]$:


Proposition 4.7. Let p be a prime in $\mathbb{Z}$. Then:
(i) If p = 3, then $1 - \omega$ is prime in $\mathbb{Z}[\omega]$ and $3 = -\omega^2(1 - \omega)^2$.
(ii) If $p \equiv 1 \bmod 3$, then there is a prime $\pi \in \mathbb{Z}[\omega]$ such that $p = \pi\bar{\pi}$, and
the primes $\pi$ and $\bar{\pi}$ are nonassociate in $\mathbb{Z}[\omega]$.
(iii) If $p \equiv 2 \bmod 3$, then p remains prime in $\mathbb{Z}[\omega]$.
Furthermore, every prime in $\mathbb{Z}[\omega]$ is associate to one of the primes listed in
(i)-(iii) above.


Proof. Since $N(1 - \omega) = 3$, Lemma 4.6 implies that $1 - \omega$ is prime in $\mathbb{Z}[\omega]$,
and (i) follows. To prove (ii), suppose that $p \equiv 1 \bmod 3$. Then $(-3/p) = 1$,
so that p is represented by a reduced form of discriminant $-3$ (this is
Theorem 2.16). The only such form is $x^2 + xy + y^2$, so that p can be
written as $a^2 - ab + b^2$. Then $\pi = a + b\omega$ and $\bar{\pi} = a + b\omega^2$ have norms
$N(\pi) = N(\bar{\pi}) = p$ and hence are prime in $\mathbb{Z}[\omega]$ by Lemma 4.6. In Exercise 4.7 we
will prove that $\pi$ and $\bar{\pi}$ are nonassociate. The proof of (iii) is left to the
reader (see Exercise 4.7).

It remains to show that all primes in $\mathbb{Z}[\omega]$ are associate to one of the
above. Let's temporarily call the primes given in (i)-(iii) the known primes
of $\mathbb{Z}[\omega]$, and let $\alpha$ be any prime of $\mathbb{Z}[\omega]$. Then $N(\alpha) = \alpha\bar{\alpha}$ is an ordinary
integer and may be factored into integer primes. But (i)-(iii) imply that
any integer prime is a product of known primes in $\mathbb{Z}[\omega]$, and consequently
$\alpha\bar{\alpha} = N(\alpha)$ is also a product of known primes. The proposition then follows
since $\mathbb{Z}[\omega]$ is a UFD. Q.E.D.


Given a prime $\pi$ of $\mathbb{Z}[\omega]$, we get the maximal ideal $\pi\mathbb{Z}[\omega]$ of $\mathbb{Z}[\omega]$. The
quotient ring $\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega]$ is thus a field. We can describe this field more
carefully as follows:


Lemma 4.8. If $\pi$ is a prime of $\mathbb{Z}[\omega]$, then the quotient field $\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega]$ is
a finite field with $N(\pi)$ elements. Furthermore, $N(\pi) = p$ or $p^2$ for some
integer prime p, and:
(i) If p = 3 or $p \equiv 1 \bmod 3$, then $N(\pi) = p$ and $\mathbb{Z}/p\mathbb{Z} \simeq \mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega]$.
(ii) If $p \equiv 2 \bmod 3$, then $N(\pi) = p^2$ and $\mathbb{Z}/p\mathbb{Z}$ is the unique subfield of
order p of the field $\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega]$ of $p^2$ elements.


Proof. In §7 we will prove that if $\pi$ is a nonzero element of $\mathbb{Z}[\omega]$, then
$\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega]$ is a finite ring with $N(\pi)$ elements (see Lemma 7.14 or Ireland
and Rosen [59, §§9.2 and 14.1]). Then (i) and (ii) follow easily (see Exercise
4.8). Q.E.D.


Given $\alpha$, $\beta$ and $\pi$ in $\mathbb{Z}[\omega]$, we will write $\alpha \equiv \beta \bmod \pi$ to indicate that
$\alpha$ and $\beta$ differ by a multiple of $\pi$, i.e., that they give the same element in
$\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega]$. Using this notation, Lemma 4.8 gives us the following analog
of Fermat's Little Theorem:


Corollary 4.9. If $\pi$ is prime in $\mathbb{Z}[\omega]$ and doesn't divide $\alpha \in \mathbb{Z}[\omega]$, then

$$\alpha^{N(\pi)-1} \equiv 1 \bmod \pi.$$


Proof. This follows because $(\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega])^*$ is a finite group with $N(\pi) - 1$
elements. Q.E.D.


Given these properties of $\mathbb{Z}[\omega]$, we can now define the generalized
Legendre symbol $(\alpha/\pi)_3$. Let $\pi$ be a prime of $\mathbb{Z}[\omega]$ not dividing 3 (i.e., not
associate to $1 - \omega$). It is straightforward to check that $3 \mid N(\pi) - 1$ (see
Exercise 4.9). Now suppose that $\alpha \in \mathbb{Z}[\omega]$ is not divisible by $\pi$. It follows from
Corollary 4.9 that $x = \alpha^{(N(\pi)-1)/3}$ is a root of $x^3 \equiv 1 \bmod \pi$. Since

$$x^3 - 1 \equiv (x - 1)(x - \omega)(x - \omega^2) \bmod \pi$$

and $\pi$ is prime, it follows that

$$\alpha^{(N(\pi)-1)/3} \equiv 1, \omega, \omega^2 \bmod \pi.$$

However, the cube roots of unity $1, \omega, \omega^2$ are incongruent modulo $\pi$. To see
this, note that if any two were congruent, then we would have $1 \equiv \omega \bmod \pi$,
which would contradict $\pi$ not associate to $1 - \omega$ (see Exercise 4.9 for the
details). Then we define the Legendre symbol $(\alpha/\pi)_3$ to be the unique cube
root of unity such that

(4.10) $\alpha^{(N(\pi)-1)/3} \equiv \left(\dfrac{\alpha}{\pi}\right)_3 \bmod \pi.$

The basic properties of the Legendre symbol are easy to work out. First,
from (4.10), one can show

$$\left(\frac{\alpha\beta}{\pi}\right)_3 = \left(\frac{\alpha}{\pi}\right)_3 \left(\frac{\beta}{\pi}\right)_3,$$

and second, $\alpha \equiv \beta \bmod \pi$ implies that

$$\left(\frac{\alpha}{\pi}\right)_3 = \left(\frac{\beta}{\pi}\right)_3$$

(see Exercise 4.10). The Legendre symbol may thus be regarded as a group
homomorphism from $(\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega])^*$ to $\mathbb{C}^*$.

An important fact is that the multiplicative group of any finite field is
cyclic (see Ireland and Rosen [59, §7.1]). In particular, $(\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega])^*$ is
cyclic, which implies that

(4.11) $\left(\dfrac{\alpha}{\pi}\right)_3 = 1 \iff \alpha^{(N(\pi)-1)/3} \equiv 1 \bmod \pi \iff x^3 \equiv \alpha \bmod \pi$ has a solution in $\mathbb{Z}[\omega]$

(see Exercise 4.11). This establishes the link between the Legendre symbol
and cubic residues. Note that one-third of $(\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega])^*$ consists of cubic
residues (where the Legendre symbol equals 1), and the remaining two-thirds
consist of nonresidues (where the symbol equals $\omega$ or $\omega^2$). Later on
we will explain how this relates to the more elementary notion of cubic
residues of integers.

To state the law of cubic reciprocity, we need one final definition: a
prime $\pi$ is called primary if $\pi \equiv \pm 1 \bmod 3$. Given any prime $\pi$ not dividing
3, one can show that exactly two of the six associates $\pm\pi$, $\pm\omega\pi$ and $\pm\omega^2\pi$
are primary (see Exercise 4.12). Then the law of cubic reciprocity states the
following:


Theorem 4.12. If $\pi$ and $\theta$ are primary primes in $\mathbb{Z}[\omega]$ of unequal norm, then

$$\left(\frac{\theta}{\pi}\right)_3 = \left(\frac{\pi}{\theta}\right)_3.$$


Proof. See Ireland and Rosen [59, §§9.4-9.5] or Smith [95, pp. 89-91].
Q.E.D.


Notice how simple the statement of the theorem is: it's among the most
elegant of all reciprocity theorems (biquadratic reciprocity, to be stated
below, is a bit more complicated). The restriction to primary primes is a
normalization analogous to the normalization p > 0 that we make for ordinary
primes. Some books (such as Ireland and Rosen [59]) define primary to
mean $\pi \equiv -1 \bmod 3$. Since $(-1/\pi)_3 = 1$, this doesn't affect the statement
of cubic reciprocity.

There are also supplementary formulas for $(\omega/\pi)_3$ and $((1-\omega)/\pi)_3$. Let $\pi$
be prime and not associate to $1 - \omega$. Then we may assume that $\pi \equiv -1 \bmod 3$
(if $\pi$ is primary, one of $\pm\pi$ satisfies this condition). Writing
$\pi = -1 + 3m + 3n\omega$, it can be shown that

(4.13) $\left(\dfrac{\omega}{\pi}\right)_3 = \omega^{m+n}, \qquad \left(\dfrac{1-\omega}{\pi}\right)_3 = \omega^{2m}.$

The first line of (4.13) is easy to prove (see Exercise 4.13), while the second
is more difficult (see Ireland and Rosen [59, p. 114] or Exercise 9.13).

Let's next discuss cubic residues of integers. If p is a prime, the basic
question is: when does $x^3 \equiv a \bmod p$ have an integer solution? If p = 3,
then Fermat's Little Theorem tells us that $a^3 \equiv a \bmod 3$ for all a, so that
we always have a solution. If $p \equiv 2 \bmod 3$, then the map $a \mapsto a^3$ induces
an automorphism of $(\mathbb{Z}/p\mathbb{Z})^*$ since $3 \nmid p - 1$ (see Exercise 4.14), and
consequently $x^3 \equiv a \bmod p$ is again always solvable. If $p \equiv 1 \bmod 3$, things are
more interesting. In this case, $p = \pi\bar{\pi}$ in $\mathbb{Z}[\omega]$, and there is a natural
isomorphism $\mathbb{Z}/p\mathbb{Z} \simeq \mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega]$ by Lemma 4.8. Thus, for $p \nmid a$, (4.11) implies
that

(4.14) $x^3 \equiv a \bmod p$ is solvable in $\mathbb{Z}$ $\iff \left(\dfrac{a}{\pi}\right)_3 = 1.$

Furthermore, $(\mathbb{Z}/p\mathbb{Z})^*$ breaks up into three pieces of equal size, one of
cubic residues and two of nonresidues.

We can now use cubic reciprocity to prove Euler's conjecture for primes
of the form $x^2 + 27y^2$:


Theorem 4.15. Let p be a prime. Then $p = x^2 + 27y^2$ if and only if
$p \equiv 1 \bmod 3$ and 2 is a cubic residue modulo p.


Proof. First, suppose that $p = x^2 + 27y^2$. This clearly implies that
$p \equiv 1 \bmod 3$, so that we need only show that 2 is a cubic residue modulo p. Let
$\pi = x + 3\sqrt{-3}\,y$, so that $p = \pi\bar{\pi}$ in $\mathbb{Z}[\omega]$. It follows that $\pi$ is prime, and then
by (4.14), 2 is a cubic residue modulo p if and only if $(2/\pi)_3 = 1$. However,
both 2 and $\pi = x + 3\sqrt{-3}\,y$ are primary primes, so that cubic reciprocity
implies

(4.16) $\left(\dfrac{2}{\pi}\right)_3 = \left(\dfrac{\pi}{2}\right)_3.$

It thus suffices to prove that $(\pi/2)_3 = 1$. However, from (4.10), we know
that

(4.17) $\left(\dfrac{\pi}{2}\right)_3 \equiv \pi \bmod 2$

since $(N(2) - 1)/3 = 1$. So we need only show that $\pi \equiv 1 \bmod 2$. Since
$\sqrt{-3} = 1 + 2\omega$, $\pi = x + 3\sqrt{-3}\,y = x + 3y + 6y\omega$, so that $\pi \equiv x + 3y \equiv x + y \bmod 2$.
But x and y must have opposite parity since $p = x^2 + 27y^2$, and we are
done.

Conversely, suppose that $p \equiv 1 \bmod 3$ is prime and 2 is a cubic residue
modulo p. We can write p as $p = \pi\bar{\pi}$, and we can assume that $\pi$ is a
primary prime in $\mathbb{Z}[\omega]$. This means that $\pi = a + 3b\omega$ for some integers a
and b. Thus

$$4p = 4\pi\bar{\pi} = 4(a^2 - 3ab + 9b^2) = (2a - 3b)^2 + 27b^2.$$

Once we show b is even, it will follow immediately that p is of the form
$x^2 + 27y^2$.

We now can use our assumption that 2 is a cubic residue modulo p.
From (4.14) we know that $(2/\pi)_3 = 1$, and then cubic reciprocity (4.16) tells
us that $(\pi/2)_3 = 1$. But by (4.17), this implies $\pi \equiv 1 \bmod 2$, which we can
write as $a + 3b\omega \equiv 1 \bmod 2$. This easily implies that a is odd and b is even,
and $p = x^2 + 27y^2$ follows. The theorem is proved. Q.E.D.
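
Theorem 4.15 lends itself to a quick numerical sanity check. The following sketch is an illustration only, with hypothetical helper names: for each prime below a bound it compares the condition "$p \equiv 1 \bmod 3$ and 2 is a cubic residue" with a brute-force search for a representation $p = x^2 + 27y^2$. For $p \equiv 1 \bmod 3$ it tests the cubic residue condition via $2^{(p-1)/3} \equiv 1 \bmod p$, which is valid because $(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small bounds used here."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def is_x2_plus_27y2(p):
    """Brute-force search for integers x, y >= 0 with p = x^2 + 27 y^2."""
    y = 0
    while 27 * y * y <= p:
        rest = p - 27 * y * y
        x = isqrt(rest)
        if x * x == rest:
            return True
        y += 1
    return False

def two_is_cubic_residue(p):
    """For a prime p = 1 mod 3: 2 is a cube mod p iff 2^((p-1)/3) = 1 mod p."""
    return pow(2, (p - 1) // 3, p) == 1

def check_theorem_4_15(bound):
    """Return the primes below bound of the form x^2 + 27 y^2, checking
    that the two sides of the theorem agree for every prime below bound."""
    hits = []
    for p in range(2, bound):
        if not is_prime(p):
            continue
        criterion = p % 3 == 1 and two_is_cubic_residue(p)
        assert criterion == is_x2_plus_27y2(p), p
        if criterion:
            hits.append(p)
    return hits

if __name__ == "__main__":
    print(check_theorem_4_15(500))
```

A check of this kind confirms the statement only on a finite range and is no substitute for the proof.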


B. $\mathbb{Z}[i]$ and Biquadratic Reciprocity

Our treatment of biquadratic reciprocity will be brief since the basic ideas
are similar to what we did for cubic residues (for a complete discussion, see
Ireland and Rosen [59, §§9.7-9.9]). Here, the appropriate ring is the ring of
Gaussian integers $\mathbb{Z}[i]$ as defined in (4.2). The norm function
$N(a + bi) = a^2 + b^2$ makes $\mathbb{Z}[i]$ into a Euclidean ring, and hence $\mathbb{Z}[i]$ is also a PID and
a UFD. The analogs of Lemmas 4.5 and 4.6 hold for $\mathbb{Z}[i]$, and it is easy to
check that its units are $\pm 1$ and $\pm i$ (see Exercise 4.16). The primes of $\mathbb{Z}[i]$
are described as follows:


Proposition 4.18. Let p be a prime in $\mathbb{Z}$. Then:
(i) If p = 2, then $1 + i$ is prime in $\mathbb{Z}[i]$ and $2 = i^3(1 + i)^2$.
(ii) If $p \equiv 1 \bmod 4$, then there is a prime $\pi \in \mathbb{Z}[i]$ such that $p = \pi\bar{\pi}$, and the
primes $\pi$ and $\bar{\pi}$ are nonassociate in $\mathbb{Z}[i]$.
(iii) If $p \equiv 3 \bmod 4$, then p remains prime in $\mathbb{Z}[i]$.
Furthermore, every prime in $\mathbb{Z}[i]$ is associate to one of the primes listed in
(i)-(iii) above.


Proof. See Exercise 4.16. Q.E.D.


We also have the following version of Fermat's Little Theorem: if $\pi$ is
prime in $\mathbb{Z}[i]$ and doesn't divide $\alpha \in \mathbb{Z}[i]$, then

(4.19) $\alpha^{N(\pi)-1} \equiv 1 \bmod \pi$

(see Exercise 4.16).

These basic facts about the Gaussian integers appear in many elementary
texts (e.g., Herstein [54, §3.8]), but such books rarely mention that the
whole reason Gauss introduced the Gaussian integers was so that he could
state biquadratic reciprocity. We will have more to say about this later.

We can now define the Legendre symbol $(\alpha/\pi)_4$. Given a prime $\pi$ of
$\mathbb{Z}[i]$ not associate to $1 + i$, it can be proved that $\pm 1, \pm i$ are distinct modulo
$\pi$ and that $4 \mid N(\pi) - 1$ (see Exercise 4.17). Then, for $\alpha$ not divisible by $\pi$,
the Legendre symbol $(\alpha/\pi)_4$ is defined to be the unique fourth root of unity
such that

(4.20) $\alpha^{(N(\pi)-1)/4} \equiv \left(\dfrac{\alpha}{\pi}\right)_4 \bmod \pi.$

As in the cubic case, we see that

$$\left(\frac{\alpha}{\pi}\right)_4 = 1 \iff x^4 \equiv \alpha \bmod \pi \text{ is solvable in } \mathbb{Z}[i],$$

and furthermore, the Legendre symbol gives a character from $(\mathbb{Z}[i]/\pi\mathbb{Z}[i])^*$
to $\mathbb{C}^*$, so that $(\mathbb{Z}[i]/\pi\mathbb{Z}[i])^*$ is divided into four equal parts (see Exercise
4.18). When $p \equiv 1 \bmod 4$, we have $(\mathbb{Z}[i]/\pi\mathbb{Z}[i])^* \simeq (\mathbb{Z}/p\mathbb{Z})^*$, and the
partition can be described as follows: one part consists of biquadratic residues
(where the symbol equals 1), another consists of quadratic residues which
aren't biquadratic residues (where the symbol equals $-1$), and the final two
parts consist of quadratic nonresidues (where the symbol equals $\pm i$); see
Exercise 4.19.

A prime $\pi$ of $\mathbb{Z}[i]$ is primary if $\pi \equiv 1 \bmod 2 + 2i$. Any prime not
associate to $1 + i$ has a unique associate which is primary (see Exercise 4.21).
With this normalization, the law of biquadratic reciprocity can be stated as
follows:


Theorem 4.21. If $\pi$ and $\theta$ are distinct primary primes in $\mathbb{Z}[i]$, then

$$\left(\frac{\theta}{\pi}\right)_4 = \left(\frac{\pi}{\theta}\right)_4 (-1)^{(N(\theta)-1)(N(\pi)-1)/16}.$$


Proof. See Ireland and Rosen [59, §9.9] or Smith [95, pp. 76-87]. Q.E.D.


There are also supplementary laws which state that

(4.22) $\left(\dfrac{i}{\pi}\right)_4 = i^{-(a-1)/2}, \qquad \left(\dfrac{1+i}{\pi}\right)_4 = i^{(a-b-1-b^2)/4},$

where $\pi = a + bi$ is a primary prime. As in the cubic case, the first line of
(4.22) is easy to prove (see Exercise 4.22), while the second is more difficult
(see Ireland and Rosen [59, Exercises 32-37, p. 136]).

We can now prove Euler's conjecture about $p = x^2 + 64y^2$:


Theorem 4.23.
(i) If $\pi = a + bi$ is a primary prime in $\mathbb{Z}[i]$, then

$$\left(\frac{2}{\pi}\right)_4 = i^{ab/2}.$$

(ii) If p is prime, then $p = x^2 + 64y^2$ if and only if $p \equiv 1 \bmod 4$ and 2 is a
biquadratic residue modulo p.


Proof. First note that (i) implies (ii). To see this, let $p \equiv 1 \bmod 4$ be prime.
We can write $p = a^2 + b^2 = \pi\bar{\pi}$, where $\pi = a + bi$ is primary. Note that a is
odd and b is even. Since $\mathbb{Z}/p\mathbb{Z} \simeq \mathbb{Z}[i]/\pi\mathbb{Z}[i]$, (i) shows that 2 is a biquadratic
residue modulo p if and only if b is divisible by 8, and (ii) follows easily.
One way to prove (i) is via the supplementary laws (4.22) since
$2 = i^3(1 + i)^2$ (see Exercise 4.23). However, in 1857, Dirichlet found a proof
of (i) that uses only quadratic reciprocity [27, Vol. II, pp. 261-262]. A
version of this proof is given in Exercise 4.24 (see also Ireland and Rosen [59,
Exercises 26-28, p. 64]). Q.E.D.
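
Part (ii) of Theorem 4.23 can be spot-checked in the same spirit as a computation. The sketch below is illustrative and uses hypothetical helper names: it decides whether 2 is a biquadratic residue by direct enumeration of fourth powers modulo p, and compares this with a brute-force search for $p = x^2 + 64y^2$.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small bounds used here."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def is_x2_plus_64y2(p):
    """Brute-force search for integers x, y >= 0 with p = x^2 + 64 y^2."""
    y = 0
    while 64 * y * y <= p:
        rest = p - 64 * y * y
        x = isqrt(rest)
        if x * x == rest:
            return True
        y += 1
    return False

def two_is_biquadratic_residue(p):
    """True if x^4 = 2 mod p has a solution, found by enumeration."""
    target = 2 % p
    return any(pow(x, 4, p) == target for x in range(1, p))

def check_theorem_4_23(bound):
    """Return the primes below bound of the form x^2 + 64 y^2, checking
    that both sides of part (ii) agree for every prime below bound."""
    hits = []
    for p in range(2, bound):
        if not is_prime(p):
            continue
        criterion = p % 4 == 1 and two_is_biquadratic_residue(p)
        assert criterion == is_x2_plus_64y2(p), p
        if criterion:
            hits.append(p)
    return hits

if __name__ == "__main__":
    print(check_theorem_4_23(600))
```

Enumeration is used here instead of the power criterion $2^{(p-1)/4} \equiv 1 \bmod p$ so that the check does not rely on the cyclic structure of $(\mathbb{Z}/p\mathbb{Z})^*$.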


C. Gauss and Higher Reciprocity 


Most of the above theorems were discovered by Gauss in the period 1805- 
1814, though the bulk of what he knew was never published. Only in 1828 
and 1832, long after the research was completed, did Gauss publish his 
two memoirs on biquadratic residues [42, Vol. II, pp. 65-148] (see also [41, 
pp. 511-586, German editions] for a German translation). The first memoir 
treats the elementary theory of biquadratic residues of integers, and it
includes a proof of Euler's conjecture for $x^2 + 64y^2$. In the second memoir,
Gauss begins with a careful discussion of the Gaussian integers, and he ex- 
plains their relevance to biquadratic reciprocity as follows [42, Vol. II, §30, 
p. 102]: 


the theorems on biquadratic residues gleam with the greatest simplicity
and genuine beauty only when the field of arithmetic is extended
to imaginary numbers, so that without restriction, the numbers
of the form a + bi constitute the object [of study], where as usual
i denotes $\sqrt{-1}$ and the indeterminates a, b denote integral real numbers
between $-\infty$ and $+\infty$. We will call such numbers integral complex
numbers (numeros integros complexos) ...


Gauss’ treatment of Z[i] includes most of what we did above, and in partic- 
ular the terms norm, associate and primary are due to Gauss. 

Gauss' statement of biquadratic reciprocity differs slightly from Theorem
4.21. In terms of the Legendre symbol, his version goes as follows: given
distinct primary primes $\pi$ and $\theta$ of $\mathbb{Z}[i]$,


If either $\pi$ or $\theta$ is congruent to 1 modulo 4, then $(\pi/\theta)_4 = (\theta/\pi)_4$.
If both $\pi$ and $\theta$ are congruent to $3 + 2i$ modulo 4, then $(\pi/\theta)_4 = -(\theta/\pi)_4$.


In Exercise 4.25 we will see that this is equivalent to Theorem 4.21. As
might be expected, Gauss doesn't use the Legendre symbol in his memoir.
Rather, he defines the biquadratic character of $\alpha$ with respect to $\pi$
to be the number $\lambda \in \{0,1,2,3\}$ satisfying $\alpha^{(N(\pi)-1)/4} \equiv i^\lambda \bmod \pi$ (so that
$(\alpha/\pi)_4 = i^\lambda$), and he states biquadratic reciprocity using the biquadratic
character. For Gauss, this theorem is "the Fundamental Theorem of
biquadratic residues" [42, Vol. II, §67, p. 138], but instead of giving a proof,
Gauss comments that


In spite of the great simplicity of this theorem, the proof belongs to 
the most hidden mysteries of higher arithmetic, and at least as things 
now stand, [the proof] can be explained only by the most subtle inves- 
tigations, which would greatly exceed the limits of the present memoir. 


Later on, we will have more to say about Gauss’ proof. 
In the second memoir, Gauss also makes his only published reference to 
cubic reciprocity [42, Vol. II, §30, p. 102]: 


The theory of cubic residues must be based in a similar way on a
consideration of numbers of the form a + bh, where h is an imaginary
root of the equation $h^3 - 1 = 0$, say $h = (-1 + \sqrt{-3})/2$, and similarly
the theory of residues of higher powers leads to the introduction of
other imaginary quantities.


So Gauss was clearly aware of the properties of $\mathbb{Z}[\omega]$, even if he never made
them public.

Turning to Gauss' unpublished material, we find that one of the earliest
fragments on higher reciprocity, dated around 1805, is the following
"Beautiful Observation Made By Induction" [42, Vol. VIII, pp. 5 and 11]:

2 is a cubic residue or nonresidue of a prime number p of the
form 3n + 1, according to whether p is representable by the form
xx + 27yy or 4xx + 2xy + 7yy.


This shows that Euler's conjecture for $x^2 + 27y^2$ was one of Gauss' starting
points. And notice that Gauss was aware that he was separating forms in
the same genus, which is the very problem we discussed in §3.

Around the same time, Gauss also rediscovered Euler's conjecture for
$x^2 + 64y^2$ [42, Vol. X.1, p. 37]. But how did he come to make these
conjectures? There are two aspects of Gauss' work that bear on this question. The
first has to do with quadratic forms. Let's follow the treatment in Gauss'
first memoir on biquadratic residues [42, Vol. II, §§12-14, pp. 75-78]. Let
$p \equiv 1 \bmod 4$ be prime. If 2 is to be a biquadratic residue modulo p, it
follows by quadratic reciprocity that $p \equiv 1 \bmod 8$ (see Exercise 4.26). By
Fermat's theorem for $x^2 + 2y^2$, p can be written as $p = a^2 + 2b^2$, and Gauss
proves the lovely result that 2 is a biquadratic residue modulo p if and only
if $a \equiv \pm 1 \bmod 8$ (see Exercise 4.27). This is nice, but Gauss isn't satisfied:


Since the decomposition of the number p into a single and dou- 
ble square is bound up so prominently with the classification of the 
number 2, it would be worth the effort to understand whether the de- 
composition into two squares, to which the number p is equally liable, 
perhaps promises a similar success. 


Gauss then computes some numerical examples, and they show that when
p is written as $a^2 + b^2$, 2 is a biquadratic residue exactly when b is divisible
by 8. This could be how Gauss was led to the conjecture in the first place,
and the same thing could have happened in the cubic case, where primes
$p \equiv 1 \bmod 3$ can be written as $a^2 + 3b^2$.

The cubic case most likely came first, for it turns out that Gauss
describes a relation between $x^2 + 27y^2$ and cubic residues in the last section
of Disquisitiones. This is where Gauss discusses the cyclotomic equation
$x^p - 1 = 0$ and proves his celebrated theorem on the constructibility of
regular polygons. To see what this has to do with cubic residues, let's describe
a little of what he does. Given an odd prime p, let $\zeta_p = e^{2\pi i/p}$ be a
primitive pth root of unity, and let g be a primitive root modulo p, i.e., g is
an integer such that [g] generates the cyclic group $(\mathbb{Z}/p\mathbb{Z})^*$. Now suppose
that $p - 1 = ef$, and let $\lambda$ be an integer. Gauss then defines [41, §343] the
period $(f, \lambda)$ to be the sum

$$(f, \lambda) = \sum_{j=0}^{f-1} \zeta_p^{\lambda g^{ej}}.$$

These periods are the key to Gauss' study of the cyclotomic field $\mathbb{Q}(\zeta_p)$. In
fact, if we fix f, then the periods $(f,1), (f,g), (f,g^2), \ldots, (f,g^{e-1})$ are the
roots of an irreducible integer polynomial of degree e, and consequently
these periods are primitive elements of the unique subfield $\mathbb{Q} \subset K \subset \mathbb{Q}(\zeta_p)$
of degree e over $\mathbb{Q}$.

When $p \equiv 1 \bmod 3$, we can write $p - 1 = 3f$, and then the three above
periods are $(f,1)$, $(f,g)$ and $(f,g^2)$. Gauss studies this case in [41, §358],
and by analyzing the products of the periods, he deduces the amazing result
that

(4.24) If $4p = a^2 + 27b^2$ and $a \equiv 1 \bmod 3$, then $N = p + a - 2$, where
N is the number of solutions modulo p of $x^3 - y^3 \equiv 1 \bmod p$.


To see how cubic residues enter into (4.24), note that $N = 9M + 6$, where
M is the number of nonzero cubic residues which, when increased by one,
remain a nonzero cubic residue (see Exercise 4.29). Gauss conjectured this
result in October 1796 and proved it in July 1797 [42, Vol. X.1, entries 39
and 67, pp. 505-506 and 519]. So Gauss was aware of cubic residues and
quadratic forms in 1796. Gauss' proof of (4.24) is sketched in Exercise 4.29.
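
Statement (4.24) is easy to test by direct counting. The following sketch is an illustration with hypothetical function names: `find_a` searches for the integer a with $4p = a^2 + 27b^2$ normalized by $a \equiv 1 \bmod 3$, and `count_solutions` counts pairs (x, y) modulo p with $x^3 - y^3 \equiv 1 \bmod p$. The input p is assumed to be a prime congruent to 1 modulo 3.

```python
from math import isqrt

def find_a(p):
    """Return a with 4p = a^2 + 27 b^2 and a = 1 mod 3.

    Assumes p is a prime with p % 3 == 1, so that such a representation
    exists; a is not divisible by 3, so exactly one sign works.
    """
    b = 1
    while 27 * b * b <= 4 * p:
        rest = 4 * p - 27 * b * b
        a = isqrt(rest)
        if a * a == rest:
            return a if a % 3 == 1 else -a
        b += 1
    raise ValueError("no representation 4p = a^2 + 27 b^2 found")

def count_solutions(p):
    """Number of pairs (x, y) mod p with x^3 - y^3 = 1 mod p."""
    # cube_count[c] = number of x modulo p with x^3 = c mod p.
    cube_count = [0] * p
    for x in range(p):
        cube_count[pow(x, 3, p)] += 1
    # For each y, x must satisfy x^3 = y^3 + 1 mod p.
    return sum(cube_count[(pow(y, 3, p) + 1) % p] for y in range(p))

if __name__ == "__main__":
    for p in (7, 13, 19, 31, 37, 43):
        print(p, find_a(p), count_solutions(p), p + find_a(p) - 2)
```

Tabulating the cube counts first keeps the count at about p operations instead of testing all $p^2$ pairs.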

Statement (4.24) is similar to the famous last entry in Gauss' mathematical
diary. In this entry, Gauss gives the following analog of (4.24) for the
decomposition $p = a^2 + b^2$ of a prime $p \equiv 1 \bmod 4$:


If $p = a^2 + b^2$ and $a + bi$ is primary, then $N = p - 2a - 3$, where
N is the number of solutions modulo p of $x^2 + y^2 + x^2y^2 \equiv 1 \bmod p$


(see [42, Vol. X.1, entry 146, pp. 571-572]). In general, the study of the
solutions of equations modulo p leads to the zeta function of a variety over
a finite field. For an introduction to this extremely rich topic, see Ireland
and Rosen [59, Chapter 11]. In §14 we will see how Gauss' results relate to
elliptic curves with complex multiplication.

Going back to the cubic case, there is a footnote in [41, §358] which gives
another interesting property of the periods $(f,1)$, $(f,g)$ and $(f,g^2)$:

(4.25) $\left((f,1) + \omega(f,g) + \omega^2(f,g^2)\right)^3 = p(a + b\sqrt{-27})/2$,
where $4p = a^2 + 27b^2$.


The right hand side is an integer in the ring $\mathbb{Z}[\omega]$, and one can show that
$\pi = (a + b\sqrt{-27})/2$ is a primary prime in $\mathbb{Z}[\omega]$ and that $p = \pi\bar{\pi}$. This is
how Gauss first encountered $\mathbb{Z}[\omega]$ in connection with cubic residues.
Notice also that if we set $\chi(a) = (a/\pi)_3$ and pick the primitive root g so that
$\chi(g) = \omega$, then

(4.26) $(f,1) + \omega(f,g) + \omega^2(f,g^2) = \sum_{a=1}^{p-1} \chi(a)\zeta_p^a.$

This is an example of what we now call a cubic Gauss sum. See Ireland and
Rosen [59, §§8.2-8.3] for the basic properties of Gauss sums and a modern
treatment of (4.24) and (4.25).

The above discussion shows that Gauss was aware of cubic residues and
$\mathbb{Z}[\omega]$ when he made his "Beautiful Observation" of 1805, and it's not
surprising that two years later he was able to prove a version of cubic
reciprocity [42, Vol. VIII, pp. 9-13]. The biquadratic case was harder, taking
him until sometime around 1813 or 1814 to find a complete proof. We 
know this from a letter Gauss wrote Dirichlet in 1828, where Gauss men- 
tions that he has possessed a proof of the “Main Theorem” for around 14 
years [42, Vol. II, p. 516]. Exact dates are hard to come by, for most of 
the fragments Gauss left are undated, and it’s not easy to match them up 
with his diary entries. For a fuller discussion of Gauss’ work on biquadratic 
reciprocity, see Bachmann [42, Vol. X.2.1, pp. 52-60] or Rieger [84].

Gauss’ proofs of cubic and biquadratic reciprocity probably used Gauss 
sums similar to (4.26), and many modern proofs run along the same lines 
(see Ireland and Rosen [59, Chapter 9]). Gauss sums were first used in 
Gauss’ sixth proof of quadratic reciprocity (see [42, Vol. II, pp. 55-59] or 
[41, pp. 501-505, German editions]). This is no accident, for as Gauss ex- 
plained in 1818: 


From 1805 onwards I have investigated the theory of cubic and bi- 
quadratic residues... Theorems were found by induction... which had 
a wonderful analogy with the theorems for quadratic residues. On the 
other hand, for a long time all attempts at complete proofs have been 
futile. This was the motive for endeavoring to add yet more proofs to 
those already known for quadratic residues, in the hope that of the 
many different methods given, one or the other would contribute to 
the illumination of the related arguments [for cubic and biquadratic 
residues]. This hope was in no way in vain, for at last tireless labor 
has led to favorable success. Soon the fruit of this vigilance will be 
permitted to come to public light... 


(see [42, Vol. II, p. 50] or [41, p. 497, German editions]). The irony is that 
Gauss never did publish his proofs, and it was left to Eisenstein and Jacobi 
to give us the first complete treatments of cubic and biquadratic reciprocity 
(see Collinson [22] or Smith [95, pp. 76-92] for more on the history of these 
reciprocity theorems). 

We will conclude this section with some remarks about what happened 
after Gauss. Number theory was becoming a much larger area of mathematics,
and the study of quadratic forms and reciprocity laws began to diverge. 
In the 1830s and 1840s, Dirichlet introduced L-series and began the analytic 
study of quadratic forms, and simultaneously, Eisenstein and Jacobi worked 
out not only cubic and biquadratic reciprocity, but also reciprocity for 5th, 
8th and 12th powers. Kummer was also studying higher reciprocity, and he 
introduced his “ideal numbers” to make up for the lack of unique factorization
in Q(e^{2πi/p}). Both he and Eisenstein were able to prove generalized 
reciprocity laws using these “ideal numbers” (see Ireland and Rosen [59, 
Chapter 14] and Smith [95, pp. 93-126]). In 1871 Dedekind made the tran- 
sition from “ideal numbers” to ideals in rings of algebraic integers, thereby 
laying the foundation for modern algebraic number theory and class field 
theory. 

But reciprocity was not the only force leading to class field theory: there 
was also complex multiplication. Euler, Lagrange, Legendre and others 
studied transformations of the elliptic integrals 


        ∫ dx / √((1 − x²)(1 − k²x²))


and they discovered that certain values of k, called singular moduli, gave 
elliptic integrals that could be transformed into complex multiples of them- 
selves. This phenomenon came to be called complex multiplication. In work- 
ing with complex multiplication, Abel observed that singular moduli and 
the roots of the corresponding transformation equations have remarkable 
algebraic properties. In modern terms, they generate Abelian extensions of 
Q(√−n), i.e., Galois extensions of Q(√−n) with Abelian Galois group. 
These topics will be discussed in more detail in Chapter Three. 

Kronecker extended and completed Abel’s work on complex multiplica- 
tion, and in so doing he made the amazing conjecture that every Abelian 
extension of Q(√−n) lies in one of the fields described above. Kronecker 
had earlier conjectured that every Abelian extension of Q lies in one of 
the cyclotomic fields Q(e^{2πi/n}) (this is the famous Kronecker-Weber
theorem, to be proved in §8). Abelian extensions may seem far removed from 
reciprocity theorems, but Kronecker also noticed relations between singular
moduli and quadratic forms. For example, his results on complex
multiplication by √−31 led to the following corollary which he was fond of 
quoting: 


    p = x² + 31y²  ⟺  (x³ − 10x)² + 31(x² − 1)² ≡ 0 mod p has an integral solution

(see [68, Vol. II, pp. 93 and 99-100, Vol. IV, pp. 123-129]). This is
similar to what we just proved for x² + 27y² and x² + 64y² using cubic and 
biquadratic reciprocity. So something interesting is going on here. 
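
The two criteria just referred to can be tested numerically. The following is only a brute-force sketch: it assumes the statements proved earlier in §4 (p = x² + 27y² exactly when p ≡ 1 mod 3 and 2 is a cubic residue modulo p, and p = x² + 64y² exactly when p ≡ 1 mod 4 and 2 is a biquadratic residue modulo p), and the function names are our own, chosen purely for illustration.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def represented(p, n):
    """Return True if p = x^2 + n*y^2 has a solution in integers x, y."""
    y = 0
    while n * y * y <= p:
        r = p - n * y * y
        if isqrt(r) ** 2 == r:
            return True
        y += 1
    return False


def is_power_residue(a, k, p):
    """Return True if x^k = a (mod p) has a solution x not divisible by p."""
    return any(pow(x, k, p) == a % p for x in range(1, p))


def check_euler_conjectures(limit):
    """Compare both sides of the two criteria for every odd prime below limit."""
    for p in filter(is_prime, range(3, limit)):
        if represented(p, 27) != (p % 3 == 1 and is_power_residue(2, 3, p)):
            return False
        if represented(p, 64) != (p % 4 == 1 and is_power_residue(2, 4, p)):
            return False
    return True
```

For instance, 31 = 2² + 27·1² and 73 = 3² + 64·1², and a call such as check_euler_conjectures(400) compares the two sides of each criterion for all odd primes below 400. A finite check of this kind is of course no substitute for the proofs.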

We thus have two interrelated questions of interest: 

(i) Is there a general reciprocity law that subsumes the known ones? 
(ii) Is there a general method for describing all Abelian extensions of a 
number field? 

The crowning achievement of class field theory is that it solves both of 
these problems simultaneously: an Abelian extension L of a number field 
K is classified in terms of data intrinsic to K, and the key ingredient linking 
L to this data is the Artin reciprocity theorem. Complete statements of the 
theorems of class field theory will be given in Chapter Two, and in Chapter 
Three we will explain how complex multiplication is related to the class 
field theory of imaginary quadratic fields. 

For a fuller account of the history of class field theory, see the article 
by W. and F. Ellison [32, §§III-IV] in Dieudonné’s Abrégé d’Histoire des 
Mathématiques 1700-1900. Weil has a nice discussion of reciprocity and
cyclotomic fields in [105] and [107], and Edwards describes Kummer’s “ideal 
numbers” in [31, Chapter 4]. 


D. Exercises 


4.1. Prove that Z[ω] and Z[i] are subrings of the complex numbers and 
are closed under complex conjugation. 


4.2. Let Q(ω) = {r + sω : r, s ∈ Q}, and define the norm of r + sω to be 
N(r + sω) = (r + sω)·\overline{(r + sω)}. 
(i) Show that N(r + sω) = r² − rs + s². 
(ii) Show that N(uv) = N(u)N(v) for u, v ∈ Q(ω). 
4.3. It is well-known that R = C[x,y] is a UFD (see Herstein [54, Corollary 2
to Theorem 3.11.1]). Prove that I = {f(x,y) ∈ R : f(0,0) = 0} 
is an ideal of R which is not principal, so that R is not a PID. Hint: 
I = xR + yR.


4.4. Given a ≠ 0 in a PID R, prove that a is irreducible ⟺ a is prime 
⟺ aR is a prime ideal ⟺ aR is a maximal ideal. 


4.5. Prove Lemma 4.5. Hint for (ii): use (i) and (2.4). 


4.6. While Z[ω] is a PID and a UFD, this exercise will show that the 
closely related ring Z[√−3] has neither property. 
(a) Show that ±1 are the only units of Z[√−3]. 
(b) Show that 2, 1 + √−3 and 1 − √−3 are nonassociate and irreducible
in Z[√−3]. Since 4 = 2·2 = (1 + √−3)(1 − √−3), these 
elements are not prime and thus Z[√−3] is not a UFD. 
(c) Show that the ideal in Z[√−3] generated by 2 and 1 + √−3 is not 
principal. Thus Z[√−3] is not a PID. 

4.7. This exercise is concerned with the proof of Proposition 4.7. Let p be 
a prime number. 
(a) When p ≡ 1 mod 3, we showed that p = ππ̄, where π and π̄ are 
prime in Z[ω]. Prove that π and π̄ are nonassociate in Z[ω]. 
(b) When p ≡ 2 mod 3, prove that p is prime in Z[ω]. Hint: show 
that p is irreducible. Note that by Lemma 2.5, the equation p = 
N(α) has no solutions. 

4.8. Complete the proof of Lemma 4.8. 

4.9. Let π be a prime of Z[ω] not associate to 1 − ω. 
(a) Show that 3 | N(π) − 1. 
(b) If any two of 1, ω and ω² are congruent modulo π, then show that 
1 ≡ ω mod π, and explain why this contradicts our assumption on 
π. This proves that 1, ω and ω² are distinct modulo π. 

4.10. Let π be prime in Z[ω], and let α, β ∈ Z[ω] be not divisible by π. 
Verify the following properties of the Legendre symbol. 
(a) (αβ/π)₃ = (α/π)₃(β/π)₃. 
(b) (α/π)₃ = (β/π)₃ when α ≡ β mod π. 

4.11. Let π be prime in Z[ω]. Assuming that (Z[ω]/πZ[ω])* is cyclic, prove 
(4.11). 

4.12. Let π be a prime of Z[ω] which is not associate to 1 − ω. Prove that 
exactly two of the six associates of π are primary. 

4.13. Prove the top line of (4.13). 

4.14. Use the hints in the text to prove that the congruence x³ ≡ a mod p 
is always solvable when p is a prime congruent to 2 modulo 3. 

4.15. In this problem we will give an application of cubic reciprocity which 
is similar to Theorem 4.15. Let p ≡ 1 mod 3 be a prime. 
(a) Use the proof of Theorem 4.15 to show that 4p can be written 
in the form 4p = a² + 27b², where a ≡ 1 mod 3. Conclude that 
π = (a + 3√−3 b)/2 is a primary prime of Z[ω] and that p = ππ̄. 


(b) Show that the supplementary laws (4.13) can be written 

        (ω/π)₃ = ω^{2(a+2)/3}

        ((1 − ω)/π)₃ = ω^{(a+2)/3 + b}

where π is as in part (a). 
(c) Use (b) to show that (3/π)₃ = ω^{2b}. 
(d) Use (c) and (4.14) to prove that for a prime p, 

        4p = x² + 243y²  ⟺  p ≡ 1 mod 3 and 3 is a cubic residue modulo p. 

Euler conjectured the result of (d) (in a slightly different form) in 
his Tractatus [33, Vol. V, pp. XXII and 250]. 

4.16. In this exercise we will discuss the properties of the Gaussian
integers Z[i]. 
(a) Use the norm function to prove that Z[i] is Euclidean. 
(b) Prove the analogs of Lemmas 4.5 and 4.6 for Z[i]. 
(c) Prove Proposition 4.18. 
(d) Formulate and prove the analog of Lemma 4.8 for Z[i]. 
(e) Prove (4.19). 

4.17. If π is a prime of Z[i] not associate to 1 + i, show that 4 | N(π) − 1 
and that ±1 and ±i are all distinct modulo π. 

4.18. This exercise is devoted to the properties of the Legendre symbol 
(α/π)₄, where π is prime in Z[i] and α is not divisible by π. 
(a) Show that α^{(N(π)−1)/4} is congruent to a unique fourth root of 
unity modulo π. This shows that the Legendre symbol, as given 
in (4.20), is well-defined. Hint: use Exercise 4.17. 
(b) Prove that the analogs of the properties given in Exercise 4.10 
hold for (α/π)₄. 
(c) Prove that 

        (α/π)₄ = 1  ⟺  x⁴ ≡ α mod π is solvable in Z[i]. 

4.19. In this exercise we will study the integer congruence x⁴ ≡ a mod p, 
where p ≡ 1 mod 4 is prime and a is an integer not divisible by p. 
(a) Write p = ππ̄ in Z[i]. Then use (4.20) to show that (a/π)₄² = 
(a/p), and conclude that (a/π)₄ = ±1 if and only if (a/p) = 1. 
(b) Verify the partition of (Z/pZ)* described in the discussion
following (4.20). 

4.20. Here we will study the congruence x⁴ ≡ a mod p when p ≡ 3 mod 4 
is prime and a is an integer not divisible by p. 
(a) Use (4.20) to show that (a/p)₄ = 1. Thus a is a fourth power 
modulo p in the ring Z[i]. 
(b) Show that a is the biquadratic residue of an integer modulo p if 
and only if (a/p) = 1. Hint: study the maps φ_k(x) = x^k on an 
Abelian group of order 2m, m odd. 

4.21. If a prime π of Z[i] is not associate to 1 + i, then show that a unique 
associate of π is primary. 

4.22. Prove the top formula of (4.22). 

4.23. Use the supplementary laws (4.22) to prove part (i) of Theorem 
4.23. 

4.24. Let p ≡ 1 mod 4 be prime, and write p = a² + b², where a is odd 
and b is even. The goal of this exercise is to present Dirichlet’s
elementary proof that (2/π)₄ = i^{ab/2}, where π = a + bi. 
(a) Use quadratic reciprocity for the Jacobi symbol to prove that 
(a/p) = 1. 
(b) Use 2p = (a + b)² + (a − b)² and quadratic reciprocity to show 
that 

        ((a + b)/p) = (−1)^{((a+b)² − 1)/8}. 

(c) Use (b) and (4.20) to show that 

        ((a + b)/p) = (i/π)₄ · i^{ab/2}. 

Hint: −1 = i². 
(d) From (a + b)² ≡ 2ab mod p, deduce that 
    (i) (a + b)^{(p−1)/2} ≡ (2ab)^{(p−1)/4} mod p. 
    (ii) ((a + b)/p) = (2ab/π)₄. 
(e) Show that 2ab ≡ 2a²i mod π, and then use (a) and Exercise 4.19 
to show that 

        (2ab/π)₄ = (2i/π)₄. 

(f) Combine (c), (d) and (e) to show that (2/π)₄ = i^{ab/2}. 


4.25. In this exercise we will study Gauss’ statement of biquadratic
reciprocity. 
(a) If π is a primary prime of Z[i], then show that either π ≡ 1 mod 
4 or π ≡ 3 + 2i mod 4. 
(b) Let π and θ be distinct primary primes in Z[i]. Show that
biquadratic reciprocity is equivalent to the following two
statements: 

    If either π or θ is congruent to 1 modulo 4, 
    then (π/θ)₄ = (θ/π)₄. 

    If π and θ are both congruent to 3 + 2i modulo 4, 
    then (π/θ)₄ = −(θ/π)₄. 

This is how Gauss states biquadratic reciprocity in [42, Vol. II, 
§67, p. 138]. 

4.26. If 2 is a biquadratic residue modulo an odd prime p, prove that 
p ≡ ±1 mod 8. 

4.27. In this exercise, we will present Gauss’ proof that for a prime p ≡ 
1 mod 8, the biquadratic character of 2 is determined by the
decomposition p = a² + 2b². As usual, we write p = ππ̄ in Z[i]. 
(a) Show that (−1/π)₄ = 1 when p ≡ 1 mod 8. 
(b) Use the properties of the Jacobi symbol to show that 

        (a/p) = (−1)^{(a² − 1)/8}. 

(c) Use the Jacobi symbol to show that (b/p) = 1. Hint: write b = 
2^m c, c odd, and first show that (c/p) = 1. 
(d) Show that 

        (2/π)₄ = (ab/p). 

Hint: use Exercise 4.19. 
Combining (b), (c) and (d), we see that (2/π)₄ = (−1)^{(a² − 1)/8}, and Gauss’ 
claim follows. If you read Gauss’ original argument [42, Vol. II, §13], 
you’ll appreciate how much the Jacobi symbol simplifies things. 

4.28. Let (f,λ) and (f,μ) be periods, and write (f,μ) = ζ^{μ_1} + ··· + ζ^{μ_f}.
Then prove that 

        (f,λ)·(f,μ) = Σ_{j=1}^{f} (f, λ + μ_j). 

4.29. Let p ≡ 1 mod 3 be prime, and set p − 1 = 3f. Let (f,1), (f,g) and 
(f,g²) be the periods as in the text. Recall that g is a primitive root 
modulo p. In this problem we will describe Gauss’ proof of (4.24) 
(see [41, §358]). For i, j ∈ {0,1,2}, let (ij) be the number of pairs 
(m,n), 0 ≤ m,n ≤ f − 1, such that 

        1 + g^{3m+i} ≡ g^{3n+j} mod p. 

(a) Show that the number of solutions modulo p of the equation 
x³ − y³ ≡ 1 mod p is N = 9(00) + 6. 
(b) Use Exercise 4.28 to show that 

        (f,1)·(f,1) = f + (00)(f,1) + (01)(f,g) + (02)(f,g²) 
        (f,1)·(f,g) = (10)(f,1) + (11)(f,g) + (12)(f,g²) 

and conclude that (00) + (01) + (02) = f − 1 and (10) + (11) + 
(12) = f. Hint: (f,0) = f and −1 ≡ (−1)³. 
(c) Show that (10) = (22), (11) = (20) and (12) = (21). Hint: expand 
(f,g)·(f,1) and compare it to what you got in (b). 
(d) Arguing as in (c), show that the 9 quantities (ij) reduce to three: 

        α = (12) = (21) = (00) + 1 
        β = (01) = (10) = (22) 
        γ = (02) = (20) = (11). 

(e) Note that (f,1)·(f,g)·(f,g²) is an integer. By expanding this 
quantity in terms of α, β and γ, show that 

        α² + β² + γ² − α = αβ + βγ + αγ. 

(f) Using (e), show that 

        (6α − 3β − 3γ − 2)² + 27(β − γ)² = 12(α + β + γ) + 4. 

(g) Recall that α + β + γ = f (this was proved in (b)) and that p − 
1 = 3f. Then use (f) to show that 

        4p = a² + 27b², 

where a = 6α − 3β − 3γ − 2 and b = β − γ. 
(h) Let a be as in (g). Show that 

        a = 9α − 3(α + β + γ) − 2 = 9α − p − 1. 

Then use α = (00) + 1 and (a) to conclude that 

        a = N − p + 2. 

This proves (4.24). 
In his first memoir on biquadratic residues [42, Vol. II, §§15-20, pp. 
78-89], Gauss used the (ij)’s (without using Gauss sums) to
determine the biquadratic character of 2. 
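
The conclusion of Exercise 4.29 is easy to test by machine. The sketch below is only an illustration, with function names of our own choosing: it counts the solutions N of x³ − y³ ≡ 1 mod p directly, sets a = N − p + 2 as in part (h), and recovers b from 4p = a² + 27b². It assumes that p ≡ 1 mod 3 is prime and makes no attempt to handle other inputs.

```python
from math import isqrt


def count_solutions(p):
    """Number N of pairs (x, y) modulo p with x^3 - y^3 = 1 (mod p)."""
    cubes = {}
    for x in range(p):
        c = pow(x, 3, p)
        cubes[c] = cubes.get(c, 0) + 1
    # x^3 = c and y^3 = c - 1; multiply the numbers of preimages.
    return sum(m * cubes.get((c - 1) % p, 0) for c, m in cubes.items())


def gauss_a_b(p):
    """For a prime p = 1 mod 3, return (a, b) with a = N - p + 2 and b >= 0
    the integer square root of (4p - a^2)/27."""
    a = count_solutions(p) - p + 2
    b = isqrt((4 * p - a * a) // 27)
    return a, b
```

For p = 7 one finds N = 6, so a = 1 and 28 = 1² + 27·1², and for p = 13 one finds N = 6 again, so a = −5 and 52 = (−5)² + 27·1². In both cases a ≡ 1 mod 3.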


CHAPTER TWO 


CLASS FIELD THEORY 


§5. THE HILBERT CLASS FIELD AND p = x² + ny² 


In Chapter One, we used elementary techniques to study the primes
represented by x² + ny², n > 0. Genus theory told us when p = x² + ny² for a 
large but finite number of n’s, and cubic and biquadratic reciprocity enabled 
us to treat two cases where genus theory failed. These methods are lovely 
but limited in scope. To solve p = x² + ny² when n > 0 is arbitrary, we will 
need class field theory, and this is the main task of Chapter 
Two. But rather than go directly to the general theorems of class field 
theory, in §5 we will first study the special case of the Hilbert class 
field. Theorem 5.1 below will use Artin Reciprocity for the Hilbert class 
field to solve our problem for infinitely many (but not all) n > 0. We 
will then study the case p = x² + 14y² in detail. This is a case where our 
previous methods failed, but once we determine the Hilbert class field 
of Q(√−14), Theorem 5.1 will immediately give us a criterion for when 
p = x² + 14y². 

The central notion of this section is the Hilbert class field of a number 
field K. We do not assume any previous acquaintance with this topic, for 
one of our goals is to introduce the reader to this more accessible part of 
class field theory. To see what the Hilbert class field has to do with the 
problem of representing primes by x² + ny², let’s state the main theorem 
we intend to prove: 


Theorem 5.1. Let n > 0 be an integer satisfying the following condition: 

(5.2)        n squarefree, n ≢ 3 mod 4. 

Then there is a monic irreducible polynomial f_n(x) ∈ Z[x] of degree h(−4n) 
such that if an odd prime p divides neither n nor the discriminant of f_n(x), 
then 

    p = x² + ny²  ⟺  (−n/p) = 1 and f_n(x) ≡ 0 mod p has an integer solution. 

Furthermore, f_n(x) may be taken to be the minimal polynomial of a real
algebraic integer α for which L = K(α) is the Hilbert class field of K = Q(√−n). 


While (5.2) does not give all integers n > 0, it gives infinitely many, so 
that Theorem 5.1 represents some real progress. In §9 we will use the full 
power of class field theory to prove a version of Theorem 5.1 that holds for 
all positive integers n. 
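
To get a feeling for what Theorem 5.1 asserts, one can test it numerically for n = 14. The sketch below is illustrative only. It assumes that f_14(x) may be taken to be (x² + 1)² − 8 = x⁴ + 2x² − 7, which is the polynomial usually quoted for this case; it is not derived in the text above, and the function names are ours. The discriminant of this polynomial is −7·2¹⁴, so the excluded primes are 2 and 7.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def is_x2_plus_14y2(p):
    """Return True if p = x^2 + 14*y^2 has a solution in integers."""
    y = 0
    while 14 * y * y <= p:
        r = p - 14 * y * y
        if isqrt(r) ** 2 == r:
            return True
        y += 1
    return False


def f14_has_root(p):
    """True if the assumed polynomial (x^2 + 1)^2 - 8 has a root modulo p."""
    return any(((x * x + 1) ** 2 - 8) % p == 0 for x in range(p))


def criterion(p):
    """Right-hand side of Theorem 5.1 for n = 14 (p odd, p != 7)."""
    return legendre(-14, p) == 1 and f14_has_root(p)
```

For example, 23 = 3² + 14·1², and indeed (−14/23) = 1 and x = 3 solves (x² + 1)² ≡ 8 mod 23. On the other hand (−14/13) = 1, but 8 is not a square modulo 13, so the congruence has no solution, in agreement with the fact that 13 is not of the form x² + 14y².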


A. Number Fields 


We will review some basic facts from algebraic number theory, including 
Dedekind domains, factorization of ideals, and ramification. Most of the 
proofs will be omitted, though references will be given. Readers looking for 
a more complete treatment should consult Borevich and Shafarevich [8], 
Lang [72] or Marcus [77]. For an especially compact presentation of this 
material, see Ireland and Rosen [59, Chapter 12]. 

To begin, we define a number field K to be a subfield of the complex 
numbers C which has finite degree over Q. The degree of K over Q is 
denoted [K : Q]. Given such a field K, we let O_K denote the algebraic 
integers of K, i.e., the set of all α ∈ K which are roots of a monic integer 
polynomial. The basic structure of O_K is given in the following proposition: 


Proposition 5.3. Let K be a number field. 
(i) O_K is a subring of C whose field of fractions is K. 
(ii) O_K is a free Z-module of rank [K : Q]. 


Proof. See Borevich and Shafarevich [8, §2.2] or Marcus [77, Corollaries to 
Theorems 2 and 9]. Q.E.D. 


We will often call O_K the number ring of K. To begin our study of O_K, 
we note that part (ii) of Proposition 5.3 has the following useful
consequence concerning the ideals of O_K: 


Corollary 5.4. If K is a number field and 𝔞 is a nonzero ideal of O_K, then 
the quotient ring O_K/𝔞 is finite. 


Proof. See Exercise 5.1. Q.E.D. 


Given a nonzero ideal 𝔞 of the number ring O_K, its norm is defined to 
be N(𝔞) = |O_K/𝔞|. Corollary 5.4 guarantees that N(𝔞) is finite. 

When we studied the rings Z[ω] and Z[i] in §4, we used the fact that they 
were unique factorization domains. In general, the rings O_K are not UFDs, 
but they have another property which is almost as good: they are Dedekind 
domains. This means the following: 


Theorem 5.5. Let O_K be the ring of integers in a number field K. Then O_K 
is a Dedekind domain, which means that 
(i) O_K is integrally closed in K, i.e., if α ∈ K satisfies a monic polynomial 
with coefficients in O_K, then α ∈ O_K. 
(ii) O_K is Noetherian, i.e., given any chain of ideals 𝔞_1 ⊂ 𝔞_2 ⊂ ···, there is 
an integer n such that 𝔞_n = 𝔞_{n+1} = ···. 
(iii) Every nonzero prime ideal of O_K is maximal. 


Proof. The proof of (i) follows easily from the properties of algebraic
integers (see Lang [72, §I.2] or Marcus [77, Exercise 4 to Chapter 2]), while (ii) 
and (iii) are straightforward consequences of Corollary 5.4 (see Exercise 
5.1). Q.E.D. 


The most important property of a Dedekind domain is that it has unique 
factorization at the level of ideals. More precisely: 


Corollary 5.6. If K is a number field, then any nonzero ideal 𝔞 in O_K can 
be written as a product 

        𝔞 = 𝔭_1 ··· 𝔭_r 

of prime ideals, and the decomposition is unique up to order. Furthermore, 
the 𝔭_i’s are exactly the prime ideals of O_K containing 𝔞. 


Proof. This corollary holds for any Dedekind domain. For a proof, see Lang 
[72, §I.6] or Marcus [77, Chapter 3, Theorem 16]. In Ireland and Rosen [59, 
§12.2] there is a nice proof (due to Hurwitz) that is special to the number 
field case. Q.E.D. 


Prime ideals play an especially important role in algebraic number theory.
We will often say “prime” rather than “nonzero prime ideal”, and the 
terms “prime of K” and “nonzero prime ideal of O_K” will be used
interchangeably. Notice that when 𝔭 is a prime of K, the quotient ring O_K/𝔭 
is a finite field by Corollary 5.4 and Theorem 5.5. This field is called the 
residue field of 𝔭. 

Besides ideals of O_K, we will also use fractional ideals, which are the 
nonzero finitely generated O_K-submodules of K. The name “fractional
ideal” comes from the fact that such an ideal can be written in the form α𝔞, 
where α ∈ K and 𝔞 is an ideal of O_K (see Exercise 5.2). Readers unfamiliar 
with fractional ideals should consult Marcus [77, Exercise 31 to Chapter 3]. 
The basic properties of fractional ideals are: 


Proposition 5.7. Let 𝔞 be a fractional O_K-ideal. 

(i) 𝔞 is invertible, i.e., there is a fractional O_K-ideal 𝔟 such that 𝔞𝔟 = O_K. 
The ideal 𝔟 will be denoted 𝔞⁻¹. 

(ii) 𝔞 can be written uniquely as a product 𝔞 = ∏_{i=1}^{r} 𝔭_i^{r_i}, r_i ∈ Z, where the 
𝔭_i’s are distinct prime ideals of O_K. 


Proof. See Lang [72, §I.6] or Marcus [77, Exercise 31 to Chapter 3]. 
Q.E.D. 


We will let I_K denote the set of all fractional ideals of K. I_K is closed 
under multiplication of ideals (see Exercise 5.2), and then part (i) of
Proposition 5.7 shows that I_K is a group. The most important subgroup of I_K is 
the subgroup P_K of principal fractional ideals, i.e., those of the form αO_K 
for some α ∈ K*. The quotient I_K/P_K is the ideal class group and is
denoted by C(O_K). A basic fact is that C(O_K) is a finite group (see Borevich 
and Shafarevich [8, §3.7] or Marcus [77, Corollary 2 to Theorem 35]). In 
the case of imaginary quadratic fields, we will see in Theorem 5.30 that the 
ideal class group is closely related to the form class group defined in §3. 

We will next introduce the idea of ramification, which is concerned with 
the behavior of primes in finite extensions. Suppose that K is a number 
field, and let L be a finite extension of K. If 𝔭 is a prime ideal of O_K, then 
𝔭O_L is an ideal of O_L, and hence has a prime factorization 

        𝔭O_L = 𝔓_1^{e_1} ··· 𝔓_g^{e_g} 

where the 𝔓_i’s are the distinct primes of L containing 𝔭. The integer e_i, 
also written e_{𝔓_i|𝔭}, is called the ramification index of 𝔭 in 𝔓_i. Each prime 
𝔓_i containing 𝔭 also gives a residue field extension O_K/𝔭 ⊂ O_L/𝔓_i, and 
its degree, written f_i or f_{𝔓_i|𝔭}, is the inertial degree of 𝔭 in 𝔓_i. The basic 
relation between the e_i’s and f_i’s is given by 


Theorem 5.8. Let K ⊂ L be number fields, and let 𝔭 be a prime of K. If 
e_i (resp. f_i), i = 1,...,g are the ramification indices (resp. inertial degrees) 
defined above, then 

        Σ_{i=1}^{g} e_i f_i = [L : K]. 


Proof. See Borevich and Shafarevich [8, §3.5] or Marcus [77, Theorem 21]. 
Q.E.D. 


In the above situation, we say that a prime 𝔭 of K ramifies in L if any of 
the ramification indices e_i are greater than 1. It can be proved that only a 
finite number of primes of K ramify in L (see Lang [72, §III.2] or Marcus 
[77, Corollary 3 to Theorem 24]). 

Most of the extensions K C L we will deal with will be Galois ex- 
tensions, and in this case the above description can be simplified as fol- 
lows: 


Theorem 5.9. Let K ⊂ L be a Galois extension, and let 𝔭 be prime in K. 

(i) The Galois group Gal(L/K) acts transitively on the primes of L
containing 𝔭, i.e., if 𝔓 and 𝔓′ are primes of L containing 𝔭, then there is 
σ ∈ Gal(L/K) such that σ(𝔓) = 𝔓′. 

(ii) The primes 𝔓_1,...,𝔓_g of L containing 𝔭 all have the same ramification 
index e and the same inertial degree f, and the formula of Theorem 5.8 
becomes 

        efg = [L : K]. 


Proof. For a proof of (i), see Lang [72, §I.7] or Marcus [77, Theorem 23]. 
The proof of (ii) follows easily from (i) and is left to the reader (see Exer- 
cise 5.3). Q.E.D. 


Given a Galois extension K ⊂ L, an ideal 𝔭 of K ramifies if e > 1, and 
is unramified if e = 1. If 𝔭 satisfies the stronger condition e = f = 1, we 
say that 𝔭 splits completely in L. Such a prime is unramified, and in addition 
𝔭O_L is the product of [L : K] distinct primes, the maximum number allowed 
by Theorem 5.9. In §8 we will show that L is determined uniquely by the 
primes of K that split completely in L. 

We will also need some facts concerning decomposition and inertia 
groups. Let K ⊂ L be Galois, and let 𝔓 be a prime of L. Then the
decomposition group and inertia group of 𝔓 are defined by 

        D_𝔓 = {σ ∈ Gal(L/K) : σ(𝔓) = 𝔓} 
        I_𝔓 = {σ ∈ Gal(L/K) : σ(α) ≡ α mod 𝔓 for all α ∈ O_L}. 

It is easy to show that I_𝔓 ⊂ D_𝔓 and that an element σ ∈ D_𝔓 induces an 
automorphism σ̃ of O_L/𝔓 which is the identity on O_K/𝔭, 𝔭 = 𝔓 ∩ O_K (see 
Exercise 5.4). If G denotes the Galois group of O_K/𝔭 ⊂ O_L/𝔓, it follows 
that σ̃ ∈ G. Thus the map σ ↦ σ̃ defines a homomorphism D_𝔓 → G whose 
kernel is exactly the inertia group I_𝔓 (see Exercise 5.4). Then we have: 


Proposition 5.10. Let D_𝔓, I_𝔓 and G be as above. 
(i) The homomorphism D_𝔓 → G is surjective. Thus D_𝔓/I_𝔓 ≃ G. 
(ii) |I_𝔓| = e_{𝔓|𝔭} and |D_𝔓| = e_{𝔓|𝔭} f_{𝔓|𝔭}. 


Proof. See Lang [72, §I.7] or Marcus [77, Theorem 28]. Q.E.D. 


The following proposition will help us decide when a prime is unramified 
or split completely in a Galois extension: 


Proposition 5.11. Let K ⊂ L be a Galois extension, where L = K(α) for 
some α ∈ O_L. Let f(x) be the monic minimal polynomial of α over K, so 
that f(x) ∈ O_K[x]. If 𝔭 is prime in O_K and f(x) is separable modulo 𝔭, then 
(i) 𝔭 is unramified in L. 
(ii) If f(x) ≡ f_1(x) ··· f_g(x) mod 𝔭, where the f_i(x) are distinct and
irreducible modulo 𝔭, then 𝔓_i = 𝔭O_L + f_i(α)O_L is a prime ideal of O_L, 
𝔓_i ≠ 𝔓_j for i ≠ j, and 

        𝔭O_L = 𝔓_1 ··· 𝔓_g. 

Furthermore, all of the f_i(x) have the same degree, which is the inertial 
degree f. 

(iii) 𝔭 splits completely in L if and only if f(x) ≡ 0 mod 𝔭 has a solution in 
O_K. 


Proof. Note that (i) and (iii) are immediate consequences of (ii) (see
Exercise 5.5). To prove (ii), note that f(x) separable modulo 𝔭 implies that 

        f(x) ≡ f_1(x) ··· f_g(x) mod 𝔭, 

where the f_i(x) are distinct and irreducible modulo 𝔭. The fact that the 
above congruence governs the splitting of 𝔭 in O_L is a general fact that 
holds for arbitrary finite extensions (see Marcus [77, Theorem 27]).
However, the decomposition group from Proposition 5.10 makes the proof in 
the Galois case especially easy. See Exercise 5.6. Q.E.D. 
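
Part (iii) of Proposition 5.11 lends itself to a quick numerical illustration. In the sketch below we take K = Q and L = Q(ζ), where ζ is a primitive fifth root of unity; this example, and the function names, are our own choice rather than the text's. Here f(x) = x⁴ + x³ + x² + x + 1, which is separable modulo every prime p ≠ 5, and it is a standard fact that such a prime splits completely in L exactly when p ≡ 1 mod 5. The code finds roots by brute force and compares the two conditions.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def phi5_has_root(p):
    """True if x^4 + x^3 + x^2 + x + 1 has a root modulo p (brute force)."""
    return any((x**4 + x**3 + x**2 + x + 1) % p == 0 for x in range(p))


def agrees_with_congruence_condition(limit):
    """Check, for primes p != 5 below limit, that f has a root mod p
    exactly when p = 1 mod 5."""
    return all(
        phi5_has_root(p) == (p % 5 == 1)
        for p in range(2, limit)
        if is_prime(p) and p != 5
    )
```

For instance, x = 3 is a root modulo 11, since 81 + 27 + 9 + 3 + 1 = 121, while there is no root modulo 2 or 3.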
B. Quadratic Fields 


To better understand the theory just sketched, let’s apply it to the case 
of quadratic number fields. Such a field can be written uniquely in the form 
K = Q(√N), where N ≠ 0,1 is a squarefree integer. The basic invariant of 
K is its discriminant d_K, which is defined to be 

(5.12)        d_K = N     if N ≡ 1 mod 4 
              d_K = 4N    otherwise. 

Note that d_K ≡ 0,1 mod 4 and K = Q(√d_K), so that a quadratic field is 
determined by its discriminant. 

The next step is to describe the integers O_K of K. Writing K = Q(√N), 
N squarefree, one can show that 

(5.13)        O_K = Z[√N]            if N ≢ 1 mod 4 
              O_K = Z[(1 + √N)/2]    if N ≡ 1 mod 4 

(see Exercise 5.7 or Marcus [77, Corollary 2 to Theorem 1]). Hence the 
rings Z[ω] and Z[i] from §4 are the full rings of integers in their respective 
fields. Using the discriminant, this description of O_K may be written more 
elegantly as follows: 

(5.14)        O_K = Z[(d_K + √d_K)/2] 

(see Exercise 5.7). 

We can now explain the restriction (5.2) made on n in Theorem 5.1. 
Namely, given n > 0, let K be the imaginary quadratic field Q(√−n). Then 
(5.12) and (5.13) imply that 

(5.15)        d_K = −4n  ⟺  O_K = Z[√−n]  ⟺  n satisfies (5.2) 

(see Exercise 5.8). Thus the condition (5.2) on n is equivalent to Z[√−n] 
being the full ring of integers in K. For other n’s, we will see in §7 that 
Z[√−n] is no longer a Dedekind domain but still has a lot of interesting 
structure. 
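
The first equivalence in (5.15) can be checked mechanically. The following sketch (function names are ours, for illustration) computes the discriminant of Q(√−n) from (5.12), after first replacing n by its squarefree part, and compares the condition d_K = −4n with condition (5.2).

```python
def squarefree_part(n):
    """Divide out all square factors from the positive integer n."""
    m, d = n, 2
    while d * d <= m:
        while m % (d * d) == 0:
            m //= d * d
        d += 1
    return m


def disc_imag_quadratic(n):
    """Discriminant of K = Q(sqrt(-n)), n > 0, using (5.12) with
    N = -(squarefree part of n)."""
    N = -squarefree_part(n)
    return N if N % 4 == 1 else 4 * N


def satisfies_5_2(n):
    """Condition (5.2): n squarefree and n not congruent to 3 mod 4."""
    return squarefree_part(n) == n and n % 4 != 3
```

For example, n = 14 gives d_K = −56 = −4·14, while n = 3 gives d_K = −3, in line with the fact that Z[√−3] is not the full ring of integers of Q(√−3). Note that Python's % operator returns a nonnegative remainder for a negative left operand, which is what the test N % 4 == 1 relies on.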

We next want to discuss the arithmetic of a quadratic field K. As in §4, 
this means describing units and primes, the difference being that “prime” 
now means “prime ideal”. Let’s first consider units. Quadratic fields come 
in two flavors, real (d_K > 0) and imaginary (d_K < 0), and the units O_K* 
behave quite differently in the two cases. In the imaginary case, there are 
only finitely many units. In §4 we computed O_K* for K = Q(√−3) or Q(i), 
and for all other imaginary quadratic fields it turns out that O_K* = {±1} 
(see Exercise 5.9). On the other hand, real quadratic fields always have 
infinitely many units, and determining them is related to Pell’s equation 
and continued fractions (see Borevich and Shafarevich [8, §2.7]). 

Before describing the primes of O_K, we will need one useful bit of 
notation: if D ≡ 0,1 mod 4, then the Kronecker symbol (D/2) is defined 
by 

        (D/2) =  0    if D ≡ 0 mod 4 
        (D/2) =  1    if D ≡ 1 mod 8 
        (D/2) = −1    if D ≡ 5 mod 8. 

We will most often apply this when D = d_K is the discriminant of a
quadratic field K. The following proposition tells us about the primes of quadratic 
fields: 


Proposition 5.16. Let K be a quadratic field of discriminant d_K, and let the 
nontrivial automorphism of K be denoted α ↦ α′. Let p be prime in Z. 
(i) If (d_K/p) = 0 (i.e., p | d_K), then pO_K = 𝔭² for some prime ideal 𝔭 of 
O_K. 
(ii) If (d_K/p) = 1, then pO_K = 𝔭𝔭′, where 𝔭 ≠ 𝔭′ are prime in O_K. 
(iii) If (d_K/p) = −1, then pO_K is prime in O_K. 
Furthermore, the primes in (i)-(iii) above give all nonzero primes of O_K. 


Proof. To prove (i), suppose that p is an odd prime dividing d_K, and let 𝔭 
be the ideal 

        𝔭 = pO_K + √d_K O_K. 

Squaring, one obtains 

        𝔭² = p²O_K + p√d_K O_K + d_K O_K. 

However, d_K is squarefree (except for a possible factor of 4) and p is an 
odd divisor, so that gcd(p², d_K) = p. It follows easily that 𝔭² = pO_K, and 
then the relation efg = [K : Q] = 2 from Theorem 5.9 implies that 𝔭 is a 
prime ideal. The case when p = 2 is similar and is left as part of Exercise 
5.10. 

Let’s next prove (ii) and (iii) for an odd prime p not dividing d_K. The 
key tool will be Proposition 5.11. Note that f(x) = x² − d_K is the minimal 
polynomial of the primitive element √d_K of K over Q, and since p ∤ d_K, 
f(x) is separable modulo p. Then Proposition 5.11 shows that p is
unramified in K. 

If (d_K/p) = 1, then the congruence x² ≡ d_K mod p has a solution, and 
consequently p splits completely in K by part (iii) of Proposition 5.11, i.e., 
pO_K = 𝔭_1𝔭_2 for distinct primes 𝔭_1 and 𝔭_2 of O_K. Since Gal(K/Q) acts 
transitively on the primes of K containing p (Theorem 5.9), we must have 
𝔭_1′ = 𝔭_2, and it follows that pO_K factors as claimed. If (d_K/p) = −1, then 
f(x) = x² − d_K is irreducible modulo p, and hence by part (ii) of
Proposition 5.11, pO_K is prime in K. 

The proof of (ii) and (iii) for p = 2 is similar and is left as an
exercise (see Exercise 5.10). It remains to prove that the prime ideals listed so 
far are all nonzero primes in O_K. The argument is analogous to what we 
did in Proposition 4.7, and the details are left to the reader (see Exercise 
5.10). Q.E.D. 


From this proposition, we get the following immediate corollary which 
tells us how primes of Z behave in a quadratic extension: 


Corollary 5.17. Let K be a quadratic field of discriminant d_K, and let p be 
an integer prime. Then: 

(i) p ramifies in K if and only if p divides d_K. 

(ii) p splits completely in K if and only if (d_K/p) = 1. Q.E.D. 
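
Proposition 5.16 and Corollary 5.17 translate directly into a small computation. The sketch below (with illustrative function names of our own) computes d_K from (5.12), evaluates the symbol (d_K/p) by Euler's criterion for odd p and by the Kronecker rule given above for p = 2, and reports how p behaves in K = Q(√N). It assumes that N ≠ 0,1 is squarefree and that p is prime; neither assumption is checked.

```python
def field_discriminant(N):
    """d_K for K = Q(sqrt(N)), N squarefree and N != 0, 1, following (5.12)."""
    return N if N % 4 == 1 else 4 * N


def kronecker_symbol(D, p):
    """(D/p) for a prime p and D = 0 or 1 mod 4: the Legendre symbol when p
    is odd, and the rule for (D/2) when p = 2."""
    if p == 2:
        if D % 4 == 0:
            return 0
        return 1 if D % 8 == 1 else -1
    r = pow(D % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def splitting_type(N, p):
    """Describe the factorization of p in Q(sqrt(N)) via Proposition 5.16."""
    symbol = kronecker_symbol(field_discriminant(N), p)
    return {0: "ramified", 1: "split", -1: "inert"}[symbol]
```

For K = Q(i), this reports that 2 ramifies, 5 splits and 3 remains prime, as we saw in §4. For K = Q(√−14), where d_K = −56, the primes 2 and 7 ramify and 3 splits.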


C. The Hilbert Class Field 


The Hilbert class field of a number field K is defined in terms of the un- 
ramified Abelian extensions of K. To see what these terms mean, we begin 
with the “Abelian” part. This is easy, for an extension K ⊂ L is Abelian if it 
is Galois and Gal(L/K) is an Abelian group. But we aren’t quite ready to 
define “unramified”, for we first need to discuss the ramification of infinite 
primes. 

Prime ideals of O_K are often called finite primes to distinguish them 
from the infinite primes, which are determined by the embeddings of K 
into C. A real infinite prime is an embedding σ: K → R, while a complex 
infinite prime is a pair of complex conjugate embeddings σ, σ̄: K → C, 
σ ≠ σ̄. Given an extension K ⊂ L, an infinite prime σ of K ramifies in L 
provided that σ is real but it has an extension to L which is complex. 
For example, the infinite prime of Q is unramified in Q(√2) but ramified 
in Q(√−2). 

An extension K ⊂ L is unramified if it is unramified at all primes, finite 
or infinite. While this is a very strong restriction, it can still happen that a 
given field has unramified extensions of arbitrarily high degree (an example 
is K = Q(√(−2·3·5·7·11·13)), a consequence of the work of Golod and 
Shafarevich on class field towers; see Roquette [85]). But if we ask for 
unramified Abelian extensions, a much nicer picture emerges. In §8 we will 
use class field theory to prove the following result: 


Theorem 5.18. Given a number field K, there is a finite Galois extension L 
of K such that: 


(i) L is an unramified Abelian extension of K. 
(ii) Any unramified Abelian extension of K lies in L. Q.E.D. 


The field L of Theorem 5.18 is called the Hilbert class field of K. It is the
maximal unramified Abelian extension of K and is clearly unique.

To unlock the full power of the Hilbert class field L of K, we will use the
Artin symbol to link L to the ideal structure of $\mathcal{O}_K$. The following lemma
is needed to define the Artin symbol:


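Lemma 5.19. Let $K \subset L$ be a Galois extension, and let $\mathfrak{p}$ be a prime of $\mathcal{O}_K$
which is unramified in L. If $\mathfrak{P}$ is a prime of $\mathcal{O}_L$ containing $\mathfrak{p}$, then there is
a unique element $\sigma \in \mathrm{Gal}(L/K)$ such that for all $\alpha \in \mathcal{O}_L$,

$$\sigma(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P},$$

where $N(\mathfrak{p}) = |\mathcal{O}_K/\mathfrak{p}|$ is the norm of $\mathfrak{p}$.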


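Proof. As in Proposition 5.10, let $D_{\mathfrak{P}}$ and $I_{\mathfrak{P}}$ be the decomposition and
inertia groups of $\mathfrak{P}$. Recall that $\sigma \in D_{\mathfrak{P}}$ induces an element $\tilde\sigma \in \tilde G$, where
$\tilde G$ is the Galois group of $\mathcal{O}_L/\mathfrak{P}$ over $\mathcal{O}_K/\mathfrak{p}$. Since $\mathfrak{p}$ is unramified in L,
part (ii) of Proposition 5.10 tells us that $|I_{\mathfrak{P}}| = e_{\mathfrak{P}|\mathfrak{p}} = 1$, and then the first
part of the proposition implies that $\sigma \mapsto \tilde\sigma$ defines an isomorphism

$$D_{\mathfrak{P}} \xrightarrow{\ \sim\ } \tilde G.$$

The structure of the Galois group $\tilde G$ is well known: if $\mathcal{O}_K/\mathfrak{p}$ has q elements,
then $\tilde G$ is a cyclic group with canonical generator given by the Frobenius
automorphism $x \mapsto x^q$ (see Hasse [50, pp. 40-41]). Thus there is a unique
$\sigma \in D_{\mathfrak{P}}$ which maps to the Frobenius element. Since $q = N(\mathfrak{p})$ by definition,
$\sigma$ satisfies our desired condition

$$\sigma(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P} \quad \text{for all } \alpha \in \mathcal{O}_L.$$

To prove uniqueness, note that any $\sigma$ satisfying this condition must lie in
$D_{\mathfrak{P}}$, and then we are done. Q.E.D.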
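A small numerical illustration of Lemma 5.19 may help. The sketch below is not from the original text and rests on a specific choice of fields: take $K = \mathbb{Q}$ and $L = \mathbb{Q}(i)$, so that $\mathcal{O}_L = \mathbb{Z}[i]$, and let p be a prime with $p \equiv 3 \bmod 4$. Then $\mathfrak{P} = p\mathbb{Z}[i]$ is prime, $N(p) = p$, and the element $\sigma$ of the lemma should be complex conjugation, i.e. $(a + bi)^p \equiv a - bi \bmod p$. The helper names `gauss_mul`, `gauss_pow` and `frobenius_is_conjugation` are hypothetical.

```python
def gauss_mul(u, v, p):
    """Multiply u = (a, b) ~ a + bi and v = (c, d) ~ c + di modulo p."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def gauss_pow(u, n, p):
    """Compute u**n in Z[i]/pZ[i] by repeated squaring."""
    result = (1 % p, 0)
    base = (u[0] % p, u[1] % p)
    while n > 0:
        if n & 1:
            result = gauss_mul(result, base, p)
        base = gauss_mul(base, base, p)
        n >>= 1
    return result


def frobenius_is_conjugation(p):
    """Check alpha**p = conj(alpha) mod p for every residue alpha of Z[i]/pZ[i]."""
    return all(
        gauss_pow((a, b), p, p) == (a, (-b) % p)
        for a in range(p)
        for b in range(p)
    )


if __name__ == "__main__":
    for p in (3, 7, 11, 19):
        print(p, frobenius_is_conjugation(p))
```

For $p \equiv 1 \bmod 4$ the ideal $p\mathbb{Z}[i]$ is not prime, so the check is not expected to hold there; for example it fails for p = 5.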


The unique element $\sigma$ of Lemma 5.19 is called the Artin symbol and
is denoted $((L/K)/\mathfrak{P})$ since it depends on the prime $\mathfrak{P}$ of L. Its crucial
property is that for any $\alpha \in \mathcal{O}_L$, we have

(5.20) $$\left(\frac{L/K}{\mathfrak{P}}\right)(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P},$$

where $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$. The Artin symbol $((L/K)/\mathfrak{P})$ has the following useful
properties:


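Corollary 5.21. Let $K \subset L$ be a Galois extension, and let $\mathfrak{p}$ be an unramified
prime of K. Given a prime $\mathfrak{P}$ of L containing $\mathfrak{p}$, we have:

(i) If $\sigma \in \mathrm{Gal}(L/K)$, then

$$\left(\frac{L/K}{\sigma(\mathfrak{P})}\right) = \sigma \left(\frac{L/K}{\mathfrak{P}}\right) \sigma^{-1}.$$

(ii) The order of $((L/K)/\mathfrak{P})$ is the inertial degree $f = f_{\mathfrak{P}|\mathfrak{p}}$.
(iii) $\mathfrak{p}$ splits completely in L if and only if $((L/K)/\mathfrak{P}) = 1$.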


Proof. The proof of (i) is a direct consequence of the uniqueness of the
Artin symbol. The details are left to the reader (see Exercise 5.12).

To prove (ii), recall from the proof of Lemma 5.19 that since $\mathfrak{p}$ is
unramified, the decomposition group $D_{\mathfrak{P}}$ is isomorphic to the Galois group
of the finite extension $\mathcal{O}_K/\mathfrak{p} \subset \mathcal{O}_L/\mathfrak{P}$ whose degree is the inertial degree
f. By definition, the Artin symbol maps to a generator of the Galois group,
so that the Artin symbol has order f as desired.

To prove (iii), recall that $\mathfrak{p}$ splits completely in L if and only if e = f = 1.
Since we’re already assuming that e = 1, (iii) follows immediately from
(ii). Q.E.D.


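When $K \subset L$ is an Abelian extension, the Artin symbol $((L/K)/\mathfrak{P})$
depends only on the underlying prime $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$. To see this, let $\mathfrak{P}'$ be
another prime containing $\mathfrak{p}$. We’ve seen that $\mathfrak{P}' = \sigma(\mathfrak{P})$ for some
$\sigma \in \mathrm{Gal}(L/K)$. Then Corollary 5.21 implies that

$$\left(\frac{L/K}{\mathfrak{P}'}\right) = \left(\frac{L/K}{\sigma(\mathfrak{P})}\right) = \sigma\left(\frac{L/K}{\mathfrak{P}}\right)\sigma^{-1} = \left(\frac{L/K}{\mathfrak{P}}\right)$$

since Gal(L/K) is Abelian. It follows that whenever $K \subset L$ is Abelian, the
Artin symbol can be written as $((L/K)/\mathfrak{p})$.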

To see the relevance of the Artin symbol to reciprocity, let’s work out an
example. Let $K = \mathbb{Q}(\sqrt{-3})$ and $L = K(\sqrt[3]{2})$. Since $\mathcal{O}_K$ is the ring $\mathbb{Z}[\omega]$ of
§4, it’s a PID, and consequently a prime ideal $\mathfrak{p}$ can be written as $\pi\mathbb{Z}[\omega]$,
where $\pi$ is prime in $\mathbb{Z}[\omega]$. If $\pi$ doesn’t divide 6, it follows from Proposition
5.11 that $\pi$ is unramified in L (see part (a) of Exercise 5.14). Since
$\mathrm{Gal}(L/K) \simeq \mathbb{Z}/3\mathbb{Z}$ is Abelian, we see that $((L/K)/\pi)$ is defined. To determine
which automorphism it is, we need only evaluate it on $\sqrt[3]{2}$. The answer
is very nice:

(5.22) $$\left(\frac{L/K}{\pi}\right)(\sqrt[3]{2}) = \left(\frac{2}{\pi}\right)_3 \sqrt[3]{2}.$$

So the Artin symbol generalizes the Legendre symbol! To prove this, let $\mathfrak{P}$
be a prime of $\mathcal{O}_L$ containing $\pi$. Then, by (5.20),

$$\left(\frac{L/K}{\pi}\right)(\sqrt[3]{2}) \equiv (\sqrt[3]{2})^{N(\pi)} \equiv 2^{(N(\pi)-1)/3}\cdot\sqrt[3]{2} \bmod \mathfrak{P}.$$

However, we know from (4.10) that

$$2^{(N(\pi)-1)/3} \equiv \left(\frac{2}{\pi}\right)_3 \bmod \pi,$$

and then $\pi \in \mathfrak{P}$ implies

$$\left(\frac{L/K}{\pi}\right)(\sqrt[3]{2}) \equiv \left(\frac{2}{\pi}\right)_3 \sqrt[3]{2} \bmod \mathfrak{P}.$$

Since $((L/K)/\pi)(\sqrt[3]{2})$ equals $\sqrt[3]{2}$ times a cube root of unity (and the cube
roots of unity are distinct modulo $\mathfrak{P}$; see part (a) of Exercise 5.13), (5.22) is
proved. In Exercise 5.14, we will generalize (5.22) to the case of the nth
power Legendre symbol.

When $K \subset L$ is an unramified Abelian extension, things are especially
nice because $((L/K)/\mathfrak{p})$ is defined for all primes $\mathfrak{p}$ of $\mathcal{O}_K$. To exploit this,
let $I_K$ be the set of all fractional ideals of $\mathcal{O}_K$. As we saw in Proposition
5.7, any fractional ideal $\mathfrak{a} \in I_K$ has a prime factorization

$$\mathfrak{a} = \prod_{i=1}^{r} \mathfrak{p}_i^{r_i}, \qquad r_i \in \mathbb{Z},$$

and then we define the Artin symbol $((L/K)/\mathfrak{a})$ to be the product

$$\left(\frac{L/K}{\mathfrak{a}}\right) = \prod_{i=1}^{r} \left(\frac{L/K}{\mathfrak{p}_i}\right)^{r_i}.$$

The Artin symbol thus defines a homomorphism, called the Artin map,

$$\left(\frac{L/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(L/K).$$

Notice that when $K \subset L$ is ramified, the Artin map is not defined on all of
$I_K$. This is one reason why the general theorems of class field theory are
complicated to state.

The Artin reciprocity theorem for the Hilbert class field relates the Hilbert
class field to the ideal class group $C(\mathcal{O}_K)$ as follows:


Theorem 5.23. If L is the Hilbert class field of a number field K, then the
Artin map

$$\left(\frac{L/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(L/K)$$

is surjective, and its kernel is exactly the subgroup $P_K$ of principal fractional
ideals. Thus the Artin map induces an isomorphism

$$C(\mathcal{O}_K) \xrightarrow{\ \sim\ } \mathrm{Gal}(L/K). \qquad \text{Q.E.D.}$$

This theorem will follow from the results of §8. The appearance of the class
group $C(\mathcal{O}_K)$ explains why L is called a “class field”.

If we apply Galois theory to Theorems 5.18 and 5.23, we get the following
classification of unramified Abelian extensions of K (see Exercise
5.17):


Corollary 5.24. Given a number field K, there is a one-to-one correspondence
between unramified Abelian extensions M of K and subgroups H of the
ideal class group $C(\mathcal{O}_K)$. Furthermore, if the extension $K \subset M$ corresponds
to the subgroup $H \subset C(\mathcal{O}_K)$, then the Artin map induces an isomorphism

$$C(\mathcal{O}_K)/H \xrightarrow{\ \sim\ } \mathrm{Gal}(M/K). \qquad \text{Q.E.D.}$$


This corollary is class field theory for unramified Abelian extensions, and 
it illustrates one of the main themes of class field theory: a certain class of 
extensions of K (unramified Abelian extensions) are classified in terms of 
data intrinsic to K (subgroups of the ideal class group). The theorems we 
encounter in §8 will follow the same format. 

Theorem 5.23 also allows us to characterize the primes of K which split 
completely in the Hilbert class field: 


Corollary 5.25. Let L be the Hilbert class field of a number field K, and let
$\mathfrak{p}$ be a prime ideal of K. Then

$$\mathfrak{p} \text{ splits completely in } L \iff \mathfrak{p} \text{ is a principal ideal.}$$


Proof. By Corollary 5.21, we know that $\mathfrak{p}$ splits completely if and only if
$((L/K)/\mathfrak{p}) = 1$. Since the Artin map induces an isomorphism
$C(\mathcal{O}_K) \simeq \mathrm{Gal}(L/K)$, we see that $((L/K)/\mathfrak{p}) = 1$ if and only if $\mathfrak{p}$ determines the
trivial class of $C(\mathcal{O}_K)$. By the definition of the ideal class group, this means
that $\mathfrak{p}$ is principal, and the corollary is proved. Q.E.D.


In §8, we will see that the Hilbert class field is characterized by the
property that the primes that split completely are exactly the principal prime
ideals.


D. Solution of $p = x^2 + ny^2$ for infinitely many n


Now that we know about the Hilbert class field, we can prove Theorem 5.1: 


Proof of Theorem 5.1. The first step is to relate $p = x^2 + ny^2$ to the behavior
of p in the Hilbert class field L. This result is sufficiently interesting to
be a theorem in its own right:


Theorem 5.26. Let L be the Hilbert class field of $K = \mathbb{Q}(\sqrt{-n})$. Assume that
n satisfies (5.2), so that $\mathcal{O}_K = \mathbb{Z}[\sqrt{-n}]$. If p is an odd prime not dividing n,
then

$$p = x^2 + ny^2 \iff p \text{ splits completely in } L.$$


Proof. Since n satisfies (5.2), we know from (5.15) that $d_K = -4n$ and
$\mathcal{O}_K = \mathbb{Z}[\sqrt{-n}]$. Let p be an odd prime not dividing n. Then $p \nmid d_K$, so that p
is unramified in K by Corollary 5.17. We will prove the following
equivalences:

(5.27)
$$\begin{aligned}
p = x^2 + ny^2 &\iff p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}},\ \text{and } \mathfrak{p} \text{ is principal in } \mathcal{O}_K \\
&\iff p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}},\ \text{and } \mathfrak{p} \text{ splits completely in } L \\
&\iff p \text{ splits completely in } L,
\end{aligned}$$

and Theorem 5.26 will follow.

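To prove the first equivalence, suppose that
$p = x^2 + ny^2 = (x + \sqrt{-n}\,y)(x - \sqrt{-n}\,y)$. Setting $\mathfrak{p} = (x + \sqrt{-n}\,y)\mathcal{O}_K$, then
$p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}}$ must be the prime factorization of $p\mathcal{O}_K$ in $\mathcal{O}_K$. Note that
$\mathfrak{p} \ne \bar{\mathfrak{p}}$ since p is unramified in K. Conversely, suppose that $p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}}$,
where $\mathfrak{p}$ is principal. Since $\mathcal{O}_K = \mathbb{Z}[\sqrt{-n}]$, we can write
$\mathfrak{p} = (x + \sqrt{-n}\,y)\mathcal{O}_K$. This implies that $p\mathcal{O}_K = (x^2 + ny^2)\mathcal{O}_K$, and it follows
that $p = x^2 + ny^2$.

The second equivalence follows immediately from Corollary 5.25. To
prove the final equivalence, we will use the following lemma: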


Lemma 5.28. Let L be the Hilbert class field of an imaginary quadratic field
K, and let $\tau$ denote complex conjugation. Then $\tau(L) = L$, and consequently
L is Galois over $\mathbb{Q}$.


Proof. It is easy to see that $\tau(L)$ is an unramified Abelian extension of
$\tau(K) = K$. Since L is the maximal such extension, we have $\tau(L) \subset L$,
and then $\tau(L) = L$ since they have the same degree over K. Hence
$\tau \in \mathrm{Gal}(L/\mathbb{Q})$, which implies that L is Galois over $\mathbb{Q}$ (see Exercise 5.19).
Q.E.D.


To finish the proof of (5.27), note that the condition

$$p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}},\ \text{and } \mathfrak{p} \text{ splits completely in } L$$

says that p splits completely in K and that some prime of K containing p
splits completely in L. Since L is Galois over $\mathbb{Q}$, this is easily seen to be
equivalent to p splitting completely in L (see Exercise 5.18), and Theorem
5.26 is proved. Q.E.D.


The next step in the proof of Theorem 5.1 is to give a more elementary
way of saying that p splits completely in L. We have the following criterion:


Proposition 5.29. Let K be an imaginary quadratic field, and let L be a
finite extension of K which is Galois over $\mathbb{Q}$. Then:

(i) There is a real algebraic integer $\alpha$ such that $L = K(\alpha)$.
(ii) Given $\alpha$ as in (i), let $f(x) \in \mathbb{Z}[x]$ denote its monic minimal polynomial.
If p is a prime not dividing the discriminant of f(x), then

$$p \text{ splits completely in } L \iff \begin{cases} (d_K/p) = 1 \text{ and } f(x) \equiv 0 \bmod p \\ \text{has an integer solution.} \end{cases}$$


Proof. By Lemma 5.28, L is Galois over $\mathbb{Q}$, and thus $[L \cap \mathbb{R} : \mathbb{Q}] = [L : K]$
since $L \cap \mathbb{R}$ is the fixed field of complex conjugation. This implies that for
$\alpha \in \mathcal{O}_L \cap \mathbb{R}$,

$$L \cap \mathbb{R} = \mathbb{Q}(\alpha) \iff L = K(\alpha)$$

(see Exercise 5.19). Hence, if $\alpha \in \mathcal{O}_L \cap \mathbb{R}$ satisfies $L \cap \mathbb{R} = \mathbb{Q}(\alpha)$, then $\alpha$ is
a real integral primitive element of L over K, and (i) is proved. Furthermore,
given such an $\alpha$, let f(x) be its monic minimal polynomial over $\mathbb{Q}$.
Then $f(x) \in \mathbb{Z}[x]$, and since $[L \cap \mathbb{R} : \mathbb{Q}] = [L : K]$, f(x) is also the minimal
polynomial of $\alpha$ over K.

To prove the final part of (ii), let p be a prime not dividing the
discriminant of f(x). This tells us that f(x) is separable modulo p. By Corollary
5.17 we have

$$p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}} \iff \left(\frac{d_K}{p}\right) = 1.$$

We may assume that p splits completely in K, so that $\mathbb{Z}/p\mathbb{Z} \simeq \mathcal{O}_K/\mathfrak{p}$. Since
f(x) is separable over $\mathbb{Z}/p\mathbb{Z}$, it is separable over $\mathcal{O}_K/\mathfrak{p}$, and then Proposition
5.11 shows that

$$\begin{aligned}
\mathfrak{p} \text{ splits completely in } L &\iff f(x) \equiv 0 \bmod \mathfrak{p} \text{ is solvable in } \mathcal{O}_K \\
&\iff f(x) \equiv 0 \bmod p \text{ is solvable in } \mathbb{Z},
\end{aligned}$$

where the last equivalence again uses $\mathbb{Z}/p\mathbb{Z} \simeq \mathcal{O}_K/\mathfrak{p}$. The proposition now
follows from the last equivalence of (5.27). Q.E.D.


We can now prove the main equivalence of Theorem 5.1. Since the
Hilbert class field L of $K = \mathbb{Q}(\sqrt{-n})$ is Galois over $\mathbb{Q}$, Proposition 5.29
implies that there is a real algebraic integer $\alpha$ which is a primitive element
of L over K. Let $f_n(x)$ be the monic minimal polynomial of $\alpha$, and let
p be an odd prime dividing neither n nor the discriminant of $f_n(x)$. Then
Theorem 5.26 and Proposition 5.29 imply that

$$\begin{aligned}
p = x^2 + ny^2 &\iff p \text{ splits completely in } L \\
&\iff \begin{cases} (-n/p) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \\ \text{has an integer solution.} \end{cases}
\end{aligned}$$

In the second equivalence, recall that n satisfies (5.2), so that $d_K = -4n$,
and hence $(d_K/p) = (-n/p)$.

It remains to show that the degree of $f_n(x)$ is the class number $h(-4n)$.
Using Galois theory and Theorem 5.23, it follows that $f_n(x)$ has degree

$$[L : K] = |\mathrm{Gal}(L/K)| = |C(\mathcal{O}_K)|.$$

In Theorem 5.30 below we will see that when $d_K < 0$, there is a natural
isomorphism

$$C(\mathcal{O}_K) \simeq C(d_K)$$

between the ideal class group $C(\mathcal{O}_K)$ and the form class group $C(d_K)$ from
§3. Since $d_K = -4n$ in our case, we have $|C(\mathcal{O}_K)| = |C(-4n)| = h(-4n)$,
which completes the proof of Theorem 5.1. Q.E.D.


The polynomial $f_n(x)$ of Theorem 5.1 is not unique: there are lots of
primitive elements. However, we can at least predict its degree in advance
by computing the class number $h(-4n)$. In §8 we will see that knowing
$f_n(x)$ is equivalent to knowing the Hilbert class field.

We have now answered our basic question of when $p = x^2 + ny^2$, at
least for those n satisfying (5.2). Notice that quadratic forms have almost
completely disappeared! We used $x^2 + ny^2$ in Theorem 5.26, but otherwise
all of the action took place using ideals rather than forms. This is typical
of what happens in modern algebraic number theory, where ideals are the
dominant language. At the same time, we don’t want to waste the work done
on quadratic forms in §§2-3. So can we translate quadratic forms into ideals?
In §7 we will study this question in detail. The full story is somewhat
complicated, but the case of negative field discriminants is rather nice: here,
the form class group $C(d_K)$ from §3 is isomorphic to the ideal class group
$C(\mathcal{O}_K)$. More precisely, we get the following theorem, which is a special
case of the results of §7:


Theorem 5.30. Let K be an imaginary quadratic field of discriminant
$d_K < 0$. Then:

(i) If $f(x,y) = ax^2 + bxy + cy^2$ is a primitive positive definite quadratic
form of discriminant $d_K$, then

$$[a, (-b + \sqrt{d_K})/2] = \{ma + n(-b + \sqrt{d_K})/2 : m, n \in \mathbb{Z}\}$$

is an ideal of $\mathcal{O}_K$.

(ii) The map sending f(x,y) to $[a, (-b + \sqrt{d_K})/2]$ induces an isomorphism
between the form class group $C(d_K)$ of §3 and the ideal class group
$C(\mathcal{O}_K)$. Hence the order of $C(\mathcal{O}_K)$ is the class number $h(d_K)$. Q.E.D.


If we combine Theorems 5.30 and 5.23, we see that the Galois group
Gal(L/K) of the Hilbert class field of an imaginary quadratic field K is
canonically isomorphic to the form class group $C(d_K)$. Thus the “class”
in “Hilbert class field” refers to Gauss’ classes of properly equivalent
quadratic forms.

This theorem allows us to compute ideal class groups using what we
know about quadratic forms. For example, consider the quadratic field
$K = \mathbb{Q}(\sqrt{-14})$ of discriminant $-56$. In §2 we saw that the reduced forms of
discriminant $-56$ are $x^2 + 14y^2$, $2x^2 + 7y^2$ and $3x^2 \pm 2xy + 5y^2$. The form
class group $C(-56)$ is thus cyclic of order 4 since only $x^2 + 14y^2$ and
$2x^2 + 7y^2$ give classes of order $\le 2$. Then, using Theorem 5.30, we see that the
ideal class group $C(\mathcal{O}_K)$ is isomorphic to $\mathbb{Z}/4\mathbb{Z}$, and furthermore, ideal class
representatives are given by $[1, \sqrt{-14}] = \mathcal{O}_K$, $[2, \sqrt{-14}]$ and $[3, 1 \pm \sqrt{-14}]$.
See Exercises 5.20-5.22 for some other applications of Theorem 5.30.
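The list of reduced forms used in this example can be reproduced by a direct search. The following is a minimal sketch, not taken from the text: the function name `reduced_forms` is hypothetical, and the code assumes the usual reduction conditions of §2, namely $|b| \le a \le c$, with $b \ge 0$ whenever $|b| = a$ or $a = c$, together with the bound $3a^2 \le -D$ that these conditions imply.

```python
from math import gcd


def reduced_forms(D):
    """List reduced primitive positive definite forms (a, b, c), b*b - 4*a*c == D."""
    assert D < 0 and D % 4 in (0, 1), "D must be a negative discriminant"
    forms = []
    a = 1
    while 3 * a * a <= -D:
        # b = -a is skipped: when |b| = a the reduced form has b >= 0.
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:
                continue  # when a = c the reduced form has b >= 0
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


if __name__ == "__main__":
    forms = reduced_forms(-56)
    print(forms)       # the reduced forms of discriminant -56
    print(len(forms))  # the class number h(-56)
```

By Theorem 5.30, the number of forms returned for a negative field discriminant $d_K$ is also the order of the ideal class group $C(\mathcal{O}_K)$.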

The final task of §5 is to work out an explicit example of Theorem 5.1.
We will discuss the case $p = x^2 + 14y^2$, which was left unresolved at the end
of §3. Of course, we know from Theorem 5.1 that there is some polynomial
$f_{14}(x)$ such that

$$p = x^2 + 14y^2 \iff \begin{cases} (-14/p) = 1 \text{ and } f_{14}(x) \equiv 0 \bmod p \\ \text{has an integer solution,} \end{cases}$$

but so far all we know about $f_{14}(x)$ is that it has degree 4 since $h(-56) = 4$.
This illustrates one weakness of Theorem 5.1: it tells us that $f_{14}(x)$ exists,
but doesn’t tell us how to find it. To determine $f_{14}(x)$, we need to know the
Hilbert class field of $\mathbb{Q}(\sqrt{-14})$. The answer is as follows:


Proposition 5.31. The Hilbert class field of $K = \mathbb{Q}(\sqrt{-14})$ is $L = K(\alpha)$,
where $\alpha = \sqrt{2\sqrt{2} - 1}$.


Proof. Since $h(-56) = 4$, the Hilbert class field has degree 4 over K. Then
$L = K(\alpha)$ will be the Hilbert class field once we show that $K \subset L$ is an
unramified Abelian extension of degree 4. It’s easy to see that $K \subset L$ is
Abelian of degree 4, so that we need only show that it is unramified.
Furthermore, since K is imaginary quadratic, the infinite primes are
automatically unramified.

Note that $\alpha^2 = 2\sqrt{2} - 1$, so that $\sqrt{2} \in L$. If we let $K_1 = K(\sqrt{2})$, then we
have the extensions

$$K \subset K_1 \subset L,$$

and it suffices to show that $K \subset K_1$ and $K_1 \subset L$ are unramified (see Exercise
5.15). Since each of these extensions is obtained by adjoining a square
root ($K_1 = K(\sqrt{2})$ and $L = K_1(\sqrt{\mu})$, $\mu = 2\sqrt{2} - 1$), let’s first prove a general
lemma about this situation:


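Lemma 5.32. Let $L = K(\sqrt{u})$ be a quadratic extension with $u \in \mathcal{O}_K$, and let
$\mathfrak{p}$ be prime in $\mathcal{O}_K$.

(i) If $2u \notin \mathfrak{p}$, then $\mathfrak{p}$ is unramified in L.

(ii) If $2 \in \mathfrak{p}$, $u \notin \mathfrak{p}$ and $u = b^2 - 4c$ for some $b, c \in \mathcal{O}_K$, then $\mathfrak{p}$ is unramified
in L.


Proof. (i) Since the discriminant of $x^2 - u$ is $4u \notin \mathfrak{p}$, $x^2 - u$ is separable
modulo $\mathfrak{p}$. Thus $\mathfrak{p}$ is unramified by Proposition 5.11.

(ii) Note that $L = K(\beta)$, where $\beta = (-b + \sqrt{u})/2$ is a root of $x^2 + bx + c$.
The discriminant is $b^2 - 4c = u \notin \mathfrak{p}$, so again $\mathfrak{p}$ is unramified by Proposition
5.11. Q.E.D.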


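Now we can prove Proposition 5.31. To study $K \subset K_1$, let $\mathfrak{p}$ be prime in
$\mathcal{O}_K$. Since $K_1 = K(\sqrt{2})$, part (i) of Lemma 5.32 implies that $\mathfrak{p}$ is unramified
whenever $2 \notin \mathfrak{p}$. It remains to study the case $2 \in \mathfrak{p}$. Since $\sqrt{-14} \in K$ and
$\sqrt{2} \in K_1$, we also have $\sqrt{-7} \in K_1$, i.e., $K_1 = K(\sqrt{-7})$. Since $-7 \notin \mathfrak{p}$ and
$-7 = 1^2 - 4\cdot 2$, $\mathfrak{p}$ is unramified by part (ii) of Lemma 5.32.

The extension $K_1 \subset L$ is almost as easy. We know that $L = K_1(\sqrt{\mu})$,
$\mu = 2\sqrt{2} - 1$. Let $\mu' = -2\sqrt{2} - 1$. Since $\sqrt{\mu}\sqrt{\mu'} = \sqrt{-7} \in K_1$, it follows that
$\sqrt{\mu'} \in L$, and in fact

$$L = K_1(\sqrt{\mu}) = K_1(\sqrt{\mu'}).$$

Now let $\mathfrak{p}$ be prime in $K_1$. If $2 \notin \mathfrak{p}$, then $\mu + \mu' = -2$ shows that $\mu \notin \mathfrak{p}$ or
$\mu' \notin \mathfrak{p}$, and $\mathfrak{p}$ is unramified by part (i) of Lemma 5.32. If $2 \in \mathfrak{p}$, then $\mu \notin \mathfrak{p}$
since $\mu = 2\sqrt{2} - 1$. We also have $\mu = (1 + \sqrt{2})^2 - 4$, and then part (ii) of
Lemma 5.32 shows that $\mathfrak{p}$ is unramified. Q.E.D.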


We can now characterize when a prime p is represented by $x^2 + 14y^2$:


Theorem 5.33. If $p \ne 7$ is an odd prime, then

$$p = x^2 + 14y^2 \iff \begin{cases} (-14/p) = 1 \text{ and } (x^2 + 1)^2 \equiv 8 \bmod p \\ \text{has an integer solution.} \end{cases}$$


Proof. Since $\alpha = \sqrt{2\sqrt{2} - 1}$ is a real integral primitive element of the Hilbert
class field of $K = \mathbb{Q}(\sqrt{-14})$, its minimal polynomial
$x^4 + 2x^2 - 7 = (x^2 + 1)^2 - 8$ can be chosen to be the polynomial $f_{14}(x)$ of Theorem 5.1. Its
discriminant is $-2^{14}\cdot 7$ (see Exercise 5.24), so that the only excluded primes
are 2 and 7. Then Theorem 5.33 follows immediately from Theorem 5.1.
Q.E.D.
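Theorem 5.33 can be checked numerically for small primes. The sketch below is an illustration only, not part of the original argument: the helper names (`is_prime`, `legendre`, `represented`, `criterion`) and the search bound 300 are hypothetical choices, and everything is done by brute force.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def represented(p: int) -> bool:
    """True if p = x^2 + 14 y^2 for some integers x, y."""
    y = 0
    while 14 * y * y <= p:
        rest = p - 14 * y * y
        if isqrt(rest) ** 2 == rest:
            return True
        y += 1
    return False


def criterion(p: int) -> bool:
    """Right-hand side of Theorem 5.33 for an odd prime p != 7."""
    if legendre(-14, p) != 1:
        return False
    return any(((x * x + 1) ** 2 - 8) % p == 0 for x in range(p))


if __name__ == "__main__":
    primes = [p for p in range(3, 300) if is_prime(p) and p != 7]
    mismatches = [p for p in primes if represented(p) != criterion(p)]
    print("mismatches:", mismatches)
    print("represented:", [p for p in primes if represented(p)])
```

The smallest prime passing both tests is $23 = 3^2 + 14\cdot 1^2$, where $x = 3$ solves $(x^2 + 1)^2 \equiv 8 \bmod 23$.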


These methods can be used to compute the Hilbert class field in other
cases (see Herz [56]). For example, in Exercise 5.25, we will see that the
Hilbert class field of $K = \mathbb{Q}(\sqrt{-17})$ is $L = K(\alpha)$, where $\alpha = \sqrt{(1 + \sqrt{17})/2}$.
This gives us an explicit criterion for a prime to be of the form $x^2 + 17y^2$
(see Exercise 5.26).

One unsatisfactory aspect of these examples is that they don’t explain
how the primitive element $\alpha$ of the Hilbert class field was found. In general,
the Hilbert class field is difficult to describe explicitly, though this can be
done for class numbers $\le 4$ (see Herz [56]). In §6 we will use genus theory
to discover the above primitive elements when $K = \mathbb{Q}(\sqrt{-14})$ or $\mathbb{Q}(\sqrt{-17})$,
and in Chapter Three we will use complex multiplication to give a general
method for finding the Hilbert class field of any imaginary quadratic field.


E. Exercises 


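5.1. Let $\mathcal{O}_K$ be the algebraic integers in a number field K.

(a) Show that a nonzero ideal $\mathfrak{a}$ of $\mathcal{O}_K$ contains a nonzero integer m.
Hint: if $\alpha \ne 0$ is in $\mathfrak{a}$, let $x^n + a_1x^{n-1} + \cdots + a_n$ be its minimal
polynomial. Show that $m = a_n$ is what we want.

(b) Show that $\mathcal{O}_K/\mathfrak{a}$ is finite whenever $\mathfrak{a}$ is a nonzero ideal of $\mathcal{O}_K$.
Hint: if m is the integer from (a), consider the surjection
$\mathcal{O}_K/m\mathcal{O}_K \to \mathcal{O}_K/\mathfrak{a}$. Use part (ii) of Proposition 5.3 to compute the
order of $\mathcal{O}_K/m\mathcal{O}_K$.

(c) Use (b) to show that every nonzero ideal of $\mathcal{O}_K$ is a free $\mathbb{Z}$-module
of rank $[K : \mathbb{Q}]$.

(d) If we have ideals $\mathfrak{a}_1 \subset \mathfrak{a}_2 \subset \cdots$, show that there is an integer n
such that $\mathfrak{a}_n = \mathfrak{a}_{n+1} = \cdots$. Hint: consider the surjections
$\mathcal{O}_K/\mathfrak{a}_1 \to \mathcal{O}_K/\mathfrak{a}_2 \to \cdots$, and use (b).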

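(e) Use (b) to show that a nonzero prime ideal of $\mathcal{O}_K$ is maximal.

5.2. We will study the elementary properties of fractional ideals in a number
field K. Recall that $\mathfrak{a} \subset K$ is a fractional ideal if, under ordinary
addition and multiplication, it is a finitely generated $\mathcal{O}_K$-module.

(a) Show that $\mathfrak{a}$ is a fractional ideal if and only if $\mathfrak{a} = \alpha\mathfrak{b}$, where
$\alpha \in K$ and $\mathfrak{b}$ is an ideal of $\mathcal{O}_K$. Hint: write each generator of $\mathfrak{a}$
in the form $\alpha/\beta$, $\alpha, \beta \in \mathcal{O}_K$. Going the other way, use part (c) of
Exercise 5.1 to show that $\alpha\mathfrak{b}$ is a finitely generated $\mathcal{O}_K$-module.

(b) Show that a nonzero fractional ideal is a free $\mathbb{Z}$-module of rank
$[K : \mathbb{Q}]$. Hint: use (a) and part (c) of Exercise 5.1.

(c) Show that the product of two fractional ideals is a fractional ideal.

5.3. Let $K \subset L$ be a Galois extension, and let $\mathfrak{p} \subset \mathfrak{P}$ be prime ideals of K
and L respectively.

(a) If $\sigma \in \mathrm{Gal}(L/K)$, then prove that $e_{\sigma(\mathfrak{P})|\mathfrak{p}} = e_{\mathfrak{P}|\mathfrak{p}}$ and
$f_{\sigma(\mathfrak{P})|\mathfrak{p}} = f_{\mathfrak{P}|\mathfrak{p}}$.

(b) Prove part (ii) of Theorem 5.9.

5.4. Let $K \subset L$ be a Galois extension, and let $\mathfrak{P}$ be prime in L. Then
we have the decomposition group $D_{\mathfrak{P}} = \{\sigma \in \mathrm{Gal}(L/K) : \sigma(\mathfrak{P}) = \mathfrak{P}\}$
and the inertia group $I_{\mathfrak{P}} = \{\sigma \in \mathrm{Gal}(L/K) : \sigma(\alpha) \equiv \alpha \bmod \mathfrak{P}$ for all
$\alpha \in \mathcal{O}_L\}$.

(a) Show that $I_{\mathfrak{P}} \subset D_{\mathfrak{P}}$.

(b) Show that $\sigma \in D_{\mathfrak{P}}$ induces an automorphism $\tilde\sigma$ of $\mathcal{O}_L/\mathfrak{P}$ which is
the identity on $\mathcal{O}_K/\mathfrak{p}$, $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$.

(c) Let $\sigma \in D_{\mathfrak{P}}$. Then show that $\sigma \in I_{\mathfrak{P}}$ if and only if the automorphism
$\tilde\sigma$ from (b) is the identity.

5.5. In Proposition 5.11, prove that parts (i) and (iii) are consequences of
part (ii).

5.6. In this exercise, we will prove part (ii) of Proposition 5.11. Let $\mathfrak{P}$ be a
prime of $\mathcal{O}_L$ containing $\mathfrak{p}$, and let $D_{\mathfrak{P}} = \{\sigma \in \mathrm{Gal}(L/K) : \sigma(\mathfrak{P}) = \mathfrak{P}\}$
be the decomposition group. In Proposition 5.10 we observed that the
order of $D_{\mathfrak{P}}$ is ef, where $e = e_{\mathfrak{P}|\mathfrak{p}}$ and $f = f_{\mathfrak{P}|\mathfrak{p}}$.

(a) Since $f(x) \equiv f_1(x)\cdots f_g(x) \bmod \mathfrak{p}$, show that $f_i(\alpha) \in \mathfrak{P}$ for some
i. We can assume that $f_1(\alpha) \in \mathfrak{P}$.

(b) Using $f = [\mathcal{O}_L/\mathfrak{P} : \mathcal{O}_K/\mathfrak{p}]$, prove that $f \ge \deg(f_1(x))$.

(c) Since $f_1(\sigma(\alpha)) \in \mathfrak{P}$ for all $\sigma \in D_{\mathfrak{P}}$, show that $\deg(f_1(x)) \ge |D_{\mathfrak{P}}| = ef$.
Hint: this is where separability is used.

(d) From (b) and (c) conclude that e = 1 and $f = \deg(f_1(x))$. Thus $\mathfrak{p}$
is unramified in L.

(e) Show that $\mathfrak{p}\mathcal{O}_L = \mathfrak{P}_1\cdots\mathfrak{P}_g$, where $\mathfrak{P}_i$ is prime in $\mathcal{O}_L$ and
$f_i(\alpha) \in \mathfrak{P}_i$. This shows that all of the $f_i(x)$’s have the same degree.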

(f) Show that $\mathfrak{P}_i$ is generated by $\mathfrak{p}$ and $f_i(\alpha)$. Hint: let
$I_i = \mathfrak{p}\mathcal{O}_L + f_i(\alpha)\mathcal{O}_L$. Use $I_i \subset \mathfrak{P}_i$ and $I_1\cdots I_g \subset \mathfrak{p}\mathcal{O}_L$ to show $I_i = \mathfrak{P}_i$.

5.7. In this problem we will determine the integers in the quadratic field
$K = \mathbb{Q}(\sqrt{N})$, where N is squarefree. Let $\alpha \mapsto \alpha'$ denote the nontrivial
automorphism of K.

(a) Given $\alpha = r + s\sqrt{N} \in K$, define the trace and norm of $\alpha$ to be

$$T(\alpha) = \alpha + \alpha' = 2r, \qquad N(\alpha) = \alpha\alpha' = r^2 - s^2N.$$

Then prove that for $\alpha, \beta \in K$,

$$T(\alpha + \beta) = T(\alpha) + T(\beta), \qquad N(\alpha\beta) = N(\alpha)N(\beta).$$

(b) Given $\alpha \in K$, prove that $\alpha \in \mathcal{O}_K$ if and only if $T(\alpha), N(\alpha) \in \mathbb{Z}$.

(c) Use (b) to prove the description of $\mathcal{O}_K$ given in (5.13).

(d) Prove the description of $\mathcal{O}_K$ given in (5.14).

5.8. Use (5.12) and (5.13) to prove (5.15).

5.9. In this exercise we will study the units in an imaginary quadratic field
K. Let $N(\alpha)$ be the norm of $\alpha \in K$ from Exercise 5.7.

(a) Prove that $\alpha \in \mathcal{O}_K$ is a unit if and only if $N(\alpha) = 1$.

(b) Show that $\mathcal{O}_K^* = \{\pm 1\}$ unless $K = \mathbb{Q}(i)$ or $\mathbb{Q}(\omega)$, in which case
$\mathcal{O}_K^* = \{\pm 1, \pm i\}$ or $\{\pm 1, \pm\omega, \pm\omega^2\}$ respectively. Hint: use (a) and
(5.13). Exercises 4.5 and 4.16 will also be useful.


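5.10. Let K be a quadratic field of discriminant $d_K$, and let the nontrivial
automorphism of K be $(a + b\sqrt{d_K})' = a - b\sqrt{d_K}$. We want to complete
the description of the prime ideals $\mathfrak{p}$ of $\mathcal{O}_K$ begun in Proposition
5.16. Our basic tools will be Proposition 5.11 and the formula
efg = 2 from Theorem 5.9.

(a) If $2 \mid d_K$, then show that $2\mathcal{O}_K = \mathfrak{p}^2$, $\mathfrak{p} = \mathfrak{p}'$ prime. Hint: write
$d_K = 4N$ and set

$$\mathfrak{p} = \begin{cases} 2\mathcal{O}_K + (1 + \sqrt{N})\mathcal{O}_K, & N \text{ odd} \\ 2\mathcal{O}_K + \sqrt{N}\,\mathcal{O}_K, & N \text{ even.} \end{cases}$$

(b) If $2 \nmid d_K$, then show that

$$\begin{aligned}
d_K \equiv 1 \bmod 8 &\iff 2\mathcal{O}_K = \mathfrak{p}\mathfrak{p}',\ \mathfrak{p} \ne \mathfrak{p}' \text{ prime} \\
d_K \equiv 5 \bmod 8 &\iff 2\mathcal{O}_K \text{ is prime in } \mathcal{O}_K.
\end{aligned}$$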


Hint: apply Proposition 5.11 to $K = \mathbb{Q}(\alpha)$, $\alpha = (1 + \sqrt{d_K})/2$.

(c) Show that the ideals described in parts (i)-(iii) of Proposition
5.16 give all prime ideals of $\mathcal{O}_K$. Hint: use norms to prove that
any prime ideal $\mathfrak{p}$ contains a nonzero integer m. Thus $\mathfrak{p} \mid m\mathcal{O}_K$,
and we are done by unique factorization.

Notice how these results generalize the descriptions given in Propositions
4.7 and 4.18 of the primes in $\mathbb{Z}[\omega]$ and $\mathbb{Z}[i]$.

5.11. This problem will study the norm of a prime $\mathfrak{p}$ in a number field K.
Recall that the norm $N(\mathfrak{p})$ is defined by $N(\mathfrak{p}) = |\mathcal{O}_K/\mathfrak{p}|$. Let p be
the unique prime of $\mathbb{Z}$ contained in $\mathfrak{p}$.

(i) Show that $N(\mathfrak{p}) = p^f$, where f is the inertial degree of $\mathfrak{p}$
over p.

(ii) Now assume that $\mathfrak{p}$ is prime in a quadratic field K. Show that

$$\begin{aligned}
p \mid d_K &: N(\mathfrak{p}) = p \\
p \nmid d_K &: N(\mathfrak{p}) = \begin{cases} p, & p \text{ splits completely in } K \\ p^2, & p\mathcal{O}_K \text{ is prime in } \mathcal{O}_K. \end{cases}
\end{aligned}$$

Hint: use efg = 2.

5.12. This exercise is concerned with the Artin symbol $((L/K)/\mathfrak{P})$.

(a) Prove part (i) of Corollary 5.21.

(b) Let $K \subset L$ be a Galois extension and let $\mathfrak{p}$ be a prime of K
unramified in L. Prove that the set $\{((L/K)/\mathfrak{P}) : \mathfrak{P}$ is a prime of
L containing $\mathfrak{p}\}$ is a conjugacy class of Gal(L/K). This conjugacy
class is defined to be the Artin symbol $((L/K)/\mathfrak{p})$ of $\mathfrak{p}$.

5.13. Assume that the number field K contains a primitive nth root of
unity $\zeta$. In this problem we will discuss a generalization of the Legendre
symbol. Let $a \in \mathcal{O}_K$ and let $\mathfrak{p}$ be a prime ideal of $\mathcal{O}_K$ such
that $na \notin \mathfrak{p}$.

(a) Prove that $1, \zeta, \ldots, \zeta^{n-1}$ are distinct modulo $\mathfrak{p}$. Hint: show that
$x^n - 1$ is separable modulo $\mathfrak{p}$.

(b) Use (a) to prove that $n \mid N(\mathfrak{p}) - 1$.

(c) Show that $a^{(N(\mathfrak{p})-1)/n}$ is congruent to a unique nth root of unity
modulo $\mathfrak{p}$. This allows us to define the nth power Legendre symbol
$(a/\mathfrak{p})_n$ to be the unique nth root of unity such that

$$a^{(N(\mathfrak{p})-1)/n} \equiv \left(\frac{a}{\mathfrak{p}}\right)_n \bmod \mathfrak{p}.$$

(d) Prove that $(a/\mathfrak{p})_n = 1$ if and only if a is an nth power residue
modulo $\mathfrak{p}$.


5.14. Let K, n, a and $\mathfrak{p}$ be as in the previous exercise, and let $L = K(\sqrt[n]{a})$.
Note that L is an Abelian extension of K. In this problem we will
relate the Legendre symbol $(a/\mathfrak{p})_n$ to the Artin symbol $((L/K)/\mathfrak{p})$.

(a) Show that $\mathfrak{p}$ is unramified in L. Hint: show that $x^n - a$ is separable
modulo $\mathfrak{p}$ and use Proposition 5.11.

(b) Generalize the argument of (5.22) to show that

$$\left(\frac{L/K}{\mathfrak{p}}\right)(\sqrt[n]{a}) = \left(\frac{a}{\mathfrak{p}}\right)_n \sqrt[n]{a}.$$

5.15. Suppose that $K \subset M \subset L$ are number fields.

(a) Let $\mathfrak{p}$ be prime in $\mathcal{O}_K$, and assume that $\mathfrak{p} \subset \mathfrak{P} \subset \mathfrak{P}'$, where $\mathfrak{P}$
(resp. $\mathfrak{P}'$) is prime in $\mathcal{O}_M$ (resp. $\mathcal{O}_L$). Then show that
$e_{\mathfrak{P}'|\mathfrak{p}} = e_{\mathfrak{P}'|\mathfrak{P}}\, e_{\mathfrak{P}|\mathfrak{p}}$.

(b) Prove that a prime $\mathfrak{p}$ of $\mathcal{O}_K$ is unramified in L if and only if
$\mathfrak{p}$ is unramified in M and every prime of $\mathcal{O}_M$ lying over $\mathfrak{p}$ is
unramified in L.

(c) Prove that L is an unramified extension of K if and only if L is
unramified over M and M is unramified over K.

5.16. Let $K \subset L$ be an unramified Abelian extension, and assume that
$K \subset M \subset L$. By the previous exercise, $K \subset M$ is unramified, and it
is clearly Abelian. We thus have Artin maps

$$\left(\frac{L/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(L/K)$$

$$\left(\frac{M/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(M/K)$$

and we also have the restriction map $r : \mathrm{Gal}(L/K) \to \mathrm{Gal}(M/K)$.
Then use Lemma 5.19 to prove that

$$\left(\frac{M/K}{\cdot}\right) = r \circ \left(\frac{L/K}{\cdot}\right).$$

5.17. Prove Corollary 5.24. Hint: besides Galois theory and Theorems 5.18
and 5.23, you will also need Exercises 5.15 and 5.16.

5.18. If $K \subset M \subset L$, where L and M are Galois over K, then prove that
a prime $\mathfrak{p}$ of $\mathcal{O}_K$ splits completely in L if and only if it splits completely
in M and some prime of $\mathcal{O}_M$ containing $\mathfrak{p}$ splits completely
in L.

5.19. Let K be an imaginary quadratic field, and let $K \subset L$ be a Galois
extension. As usual, $\tau$ will denote complex conjugation.

(a) Show that L is Galois over $\mathbb{Q}$ if and only if $\tau(L) = L$.

(b) If L is Galois over $\mathbb{Q}$, then prove that
(i) $[L \cap \mathbb{R} : \mathbb{Q}] = [L : K]$.
(ii) For $\alpha \in L \cap \mathbb{R}$, $L \cap \mathbb{R} = \mathbb{Q}(\alpha) \iff L = K(\alpha)$.
5.20. Show that $\mathbb{Z}[(1 + \sqrt{-19})/2]$ is a UFD. Hint: every PID is a UFD
(see Ireland and Rosen [59, §1.3] or Marcus [77, pp. 255-256]). Thus,
by Theorem 5.30, it suffices to show that $h(-19) = 1$.

5.21. In this exercise we will study the ring $\mathbb{Z}[\sqrt{-2}]$.

(a) Use Theorem 5.30 to show that $\mathbb{Z}[\sqrt{-2}]$ is a UFD.

(b) Show that $\sqrt{-2}$ is a prime in $\mathbb{Z}[\sqrt{-2}]$.

(c) If $ab = u^3$ in $\mathbb{Z}[\sqrt{-2}]$ and a and b are relatively prime, then
prove that a and b are cubes in $\mathbb{Z}[\sqrt{-2}]$.

5.22. We can now give a second proof of Fermat’s theorem that
$(x, y) = (3, \pm 5)$ are the only integer solutions of the equation $x^3 = y^2 + 2$.

(a) If $x^3 = y^2 + 2$, show that $y + \sqrt{-2}$ and $y - \sqrt{-2}$ are relatively
prime in $\mathbb{Z}[\sqrt{-2}]$. Hint: use part (b) of Exercise 5.21.

(b) Use part (c) of Exercise 5.21 to show that $(x, y) = (3, \pm 5)$.

This argument is due to Euler [33, Vol. I, Chapter XII, §§191-193],
though he assumed (without proof) that Exercise 5.21 was true.

5.23. If $D \equiv 1 \bmod 4$ is negative and squarefree, prove a version of Theorems
5.1 and 5.26 for primes of the form $x^2 + xy + ((1 - D)/4)y^2$.

5.24. Prove that the discriminant of $x^4 + bx^2 + c$ equals $2^4c(b^2 - 4c)^2$.
Hint: write down the roots explicitly.

5.25. Let $K = \mathbb{Q}(\sqrt{-17})$.

(a) Show that $C(\mathcal{O}_K) \simeq \mathbb{Z}/4\mathbb{Z}$.

(b) Show that the Hilbert class field of K is $L = K(\alpha)$, where
$\alpha = \sqrt{(1 + \sqrt{17})/2}$. Hint: use the methods of Proposition 5.31. The
only tricky part concerns primes of $K(\sqrt{17})$ which contain 2.
Setting $u = (1 + \sqrt{17})/2$ and $u' = (1 - \sqrt{17})/2$, note that u and
$u'$ satisfy $x = x^2 - 4$.
5.26. Prove an analog of Theorem 5.33 for primes of the form $x^2 + 17y^2$.


§6. THE HILBERT CLASS FIELD AND GENUS THEORY


In Chapter One we studied the genus theory of primitive positive definite
quadratic forms, and our main result (Theorem 3.15) was that for a fixed
discriminant D:

(i) There are $2^{\mu - 1}$ genera, where $\mu$ is the number defined in Proposition
3.11.
(ii) The principal genus consists of squares of classes.


In this section, we will use Artin reciprocity for the Hilbert class field of an
imaginary quadratic field K to prove (i) and (ii) when D is the discriminant
$d_K$ of K. This result is less general than what we proved in §3, but the
proof is such a nice application of the Hilbert class field that we couldn’t
resist including it. Readers more interested in $p = x^2 + ny^2$ may skip to §7
without loss of continuity.

The key to the class field theory interpretation of genus theory is the
concept of the genus field. Given an imaginary quadratic field K of
discriminant $d_K$, Theorem 5.30 tells us that the form class group $C(d_K)$ is
isomorphic to the ideal class group $C(\mathcal{O}_K)$. The principal genus is a subgroup of
$C(d_K)$ and hence maps to a subgroup of $C(\mathcal{O}_K)$. By Corollary 5.24, this
subgroup determines an unramified Abelian extension of K which is called
the genus field of K. Theorem 6.1 below will describe the genus field
explicitly and show that the characters used in Gauss’ definition of genus appear
in the Artin map of the genus field. This will take a fair amount of work,
but once done, (i) and (ii) above will follow easily by Artin Reciprocity.
We will then discuss how the genus field can help in the harder problem of
determining the Hilbert class field.


A. Genus Theory for Field Discriminants 


Here is the main result of this section: 


Theorem 6.1. Let K be an imaginary quadratic field of discriminant $d_K$. Let
$\mu$ be the number of primes dividing $d_K$, and let $p_1, \ldots, p_r$ be the odd primes
dividing $d_K$ (so that $\mu = r$ or $r + 1$ according to whether $d_K \equiv 1$ or $0 \bmod 4$).
Set $p_i^* = (-1)^{(p_i - 1)/2}p_i$. Then:

(i) The genus field of K is the maximal unramified extension of K which is
an Abelian extension of $\mathbb{Q}$.

(ii) The genus field of K is $K(\sqrt{p_1^*}, \ldots, \sqrt{p_r^*})$.

(iii) The number of genera of primitive positive definite forms of discriminant
$d_K$ is $2^{\mu - 1}$.

(iv) The principal genus of primitive positive definite forms of discriminant
$d_K$ consists of squares of classes.


Proof. First, note that for field discriminants $d_K$, the number $\mu$ defined in
the statement of the theorem agrees with the one defined in Proposition
3.11 (see Exercise 6.1). Note also that (iii) and (iv) of the theorem are the
facts about genus theory that we want to prove.




To start the proof, let L be the Hilbert class field of K, and let M
be the unramified Abelian extension of K corresponding to the subgroup
$C(\mathcal{O}_K)^2 \subset C(\mathcal{O}_K)$ via Corollary 5.24. We claim that

(6.2)    M is the maximal unramified extension of K Abelian over Q.

To prove this, consider an unramified extension $\tilde{M}$ of K which is Abelian
over Q. Then $\tilde{M}$ is also Abelian over K, so that $\tilde{M} \subset L$, and
we thus have the following diagram of fields:

(6.3)    $\mathbb{Q} \subset K \subset \tilde{M} \subset L$, where $\mathbb{Q} \subset \tilde{M}$ is Abelian.

We want the maximal such $\tilde{M}$. Since L is Galois over Q (see Lemma 5.28),
we can interpret (6.3) via Galois theory. Let $G = \mathrm{Gal}(L/\mathbb{Q})$. Then
$\tilde{M}$ being Abelian over Q is equivalent to
$[G,G] \subset \mathrm{Gal}(L/\tilde{M})$, where $[G,G]$ is the commutator subgroup
of G (see Exercise 6.2). Note also that $[G,G] \subset \mathrm{Gal}(L/K)$ since the
latter has index two in G. Thus $\tilde{M}$ satisfies (6.3) if and only if

$[G,G] \subset \mathrm{Gal}(L/\tilde{M}) \subset \mathrm{Gal}(L/K)$.

It follows by Galois theory that the maximal unramified extension of K
Abelian over Q is the one that corresponds to $[G,G]$. By Theorem 5.23,
$\mathrm{Gal}(L/K)$ can be identified with $C(\mathcal{O}_K)$ via the Artin map. If we
can show that $[G,G] \subset \mathrm{Gal}(L/K)$ maps to
$C(\mathcal{O}_K)^2 \subset C(\mathcal{O}_K)$, then (6.2) will follow.

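We first compute $G = \mathrm{Gal}(L/\mathbb{Q})$. We have a short exact sequence

$1 \to \mathrm{Gal}(L/K) \to G \to \mathrm{Gal}(K/\mathbb{Q}) \ (\simeq \mathbb{Z}/2\mathbb{Z}) \to 1$

which splits because complex conjugation $\tau$ is in G by Lemma 5.28. Thus
G is the semidirect product $\mathrm{Gal}(L/K) \rtimes (\mathbb{Z}/2\mathbb{Z})$,
where $\mathbb{Z}/2\mathbb{Z}$ acts by conjugation by $\tau$.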

Under the isomorphism $\mathrm{Gal}(L/K) \simeq C(\mathcal{O}_K)$, conjugation by
$\tau$ operates on $C(\mathcal{O}_K)$ by sending an ideal to its conjugate. To see
this, let $\mathfrak{p}$ be a prime ideal of $\mathcal{O}_K$. Then the uniqueness
part of Lemma 5.19 shows that

(6.4)    $\left(\frac{L/K}{\bar{\mathfrak{p}}}\right) = \tau\left(\frac{L/K}{\mathfrak{p}}\right)\tau^{-1}$

(see Exercise 6.3), and our claim follows. However, for any ideal $\mathfrak{a}$ of
$\mathcal{O}_K$, we will prove in Lemma 7.14 that the product
$\mathfrak{a}\bar{\mathfrak{a}}$ is always a principal ideal, and it follows that the
class of $\bar{\mathfrak{a}}$ is the inverse of the class of $\mathfrak{a}$ in
$C(\mathcal{O}_K)$. Hence G may be identified with the semidirect product
$C(\mathcal{O}_K) \rtimes (\mathbb{Z}/2\mathbb{Z})$, where
$\mathbb{Z}/2\mathbb{Z}$ acts by sending an element of $C(\mathcal{O}_K)$ to its
inverse.

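It is now easy to show that $[G,G] = C(\mathcal{O}_K)^2$. First, note that
$C(\mathcal{O}_K)^2$ is normal in G (any subgroup of $C(\mathcal{O}_K)$ is, which
has unexpected consequences; see Exercise 6.4), and since
$\mathbb{Z}/2\mathbb{Z}$ acts trivially on $C(\mathcal{O}_K)/C(\mathcal{O}_K)^2$
(every element is its own inverse), we have

(6.5)    $G/C(\mathcal{O}_K)^2 = (C(\mathcal{O}_K) \rtimes (\mathbb{Z}/2\mathbb{Z}))/C(\mathcal{O}_K)^2 \simeq (C(\mathcal{O}_K)/C(\mathcal{O}_K)^2) \times (\mathbb{Z}/2\mathbb{Z})$,

so that $G/C(\mathcal{O}_K)^2$ is Abelian (see Exercise 6.5). It follows that
$[G,G] \subset C(\mathcal{O}_K)^2$. To prove the opposite inclusion, note that for
any $a \in C(\mathcal{O}_K)$, we have
$(a,1) \in C(\mathcal{O}_K) \rtimes (\mathbb{Z}/2\mathbb{Z})$, and then

$(a,1)(1,\tau)(a,1)^{-1}(1,\tau)^{-1} = (a^2,1)$,

where $\tau$ is the nontrivial element of $\mathbb{Z}/2\mathbb{Z}$. This proves that
$[G,G] = C(\mathcal{O}_K)^2$, and (6.2) is proved.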
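
The group-theoretic fact just used can be checked by brute force on small examples. The following is only a sketch under stated assumptions: the class group is modeled as a hypothetical finite Abelian group $C = \prod \mathbb{Z}/m_j$ written additively, G is the semidirect product in which the nontrivial element of $\mathbb{Z}/2\mathbb{Z}$ acts by inversion, and the function names are illustrative. It compares the subgroup generated by all commutators with the subgroup of squares (doubles, in additive notation).

```python
from itertools import product

def make_group(moduli):
    """Elements, product and inverse for G = C x| Z/2Z, where C = prod Z/m
    (written additively) and the nontrivial element of Z/2Z acts by a -> -a."""
    C = list(product(*[range(m) for m in moduli]))
    elements = [(a, e) for a in C for e in (0, 1)]

    def neg(a):
        return tuple((-x) % m for x, m in zip(a, moduli))

    def mul(g, h):
        (a, e), (b, f) = g, h
        b = neg(b) if e else b  # twist b by the action of e
        return (tuple((x + y) % m for x, y, m in zip(a, b, moduli)), (e + f) % 2)

    def inv(g):
        a, e = g
        # elements (a, 1) are involutions; (a, 0) has inverse (-a, 0)
        return (a, 1) if e else (neg(a), 0)

    return elements, mul, inv

def commutator_subgroup(moduli):
    """Subgroup of G generated by all commutators g h g^-1 h^-1."""
    elements, mul, inv = make_group(moduli)
    identity = (tuple(0 for _ in moduli), 0)
    gens = {mul(mul(g, h), mul(inv(g), inv(h))) for g in elements for h in elements}
    subgroup, frontier = {identity}, [identity]
    while frontier:  # close the set of commutators under products
        g = frontier.pop()
        for s in gens:
            gs = mul(g, s)
            if gs not in subgroup:
                subgroup.add(gs)
                frontier.append(gs)
    return subgroup

def squares(moduli):
    """The subgroup C^2, embedded in G as the pairs (2a, 0)."""
    C = product(*[range(m) for m in moduli])
    return {(tuple((2 * x) % m for x, m in zip(a, moduli)), 0) for a in C}

for moduli in [(4,), (6,), (2, 2), (4, 2), (3, 5)]:
    print(moduli, commutator_subgroup(moduli) == squares(moduli))
```

In each of these small cases the two subgroups coincide, in line with the argument above; the check is of course no substitute for the proof.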
We will next show that 


(6.6)    $M = K(\sqrt{p_1^*},\dots,\sqrt{p_r^*})$,


where the $p_i^*$’s are as in the statement of the theorem. We begin with two
preliminary lemmas. The first concerns some general facts about ramification
and the Artin symbol:


Lemma 6.7. Let L and M be Abelian extensions of a number field K, and
let $\mathfrak{p}$ be a prime of $\mathcal{O}_K$.
(i) $\mathfrak{p}$ is unramified in LM if and only if it is unramified in both L and M.
(ii) If $\mathfrak{p}$ is unramified in LM, then under the natural injection
$\mathrm{Gal}(LM/K) \to \mathrm{Gal}(L/K) \times \mathrm{Gal}(M/K)$,
the Artin symbol $((LM/K)/\mathfrak{p})$ maps to $(((L/K)/\mathfrak{p}), ((M/K)/\mathfrak{p}))$.


Proof. See Exercise 6.6, or, for a more general version of these facts, Mar- 
cus [77, Exercises 10-11, pp. 117-118]. Q.E.D. 


The second lemma tells us when a quadratic extension $K \subset K(\sqrt{a})$,
$a \in \mathbb{Z}$, is unramified:

Lemma 6.8. Let K be an imaginary quadratic field of discriminant $d_K$,
and let $K(\sqrt{a})$ be a quadratic extension where $a \in \mathbb{Z}$. Then
$K \subset K(\sqrt{a})$ is unramified if and only if a can be chosen so that
$a \mid d_K$ and $a \equiv 1 \bmod 4$.

Proof. For the most part, the proof is a straightforward application of the 
techniques used in the proof of Proposition 5.31. See Exercises 6.7, 6.8 and 
6.9 for the details. Q.E.D. 


We can now prove (6.6). Let $M^* = K(\sqrt{p_1^*},\dots,\sqrt{p_r^*})$. Since
$p_i^*$ divides $d_K$ and satisfies $p_i^* \equiv 1 \bmod 4$,
$K \subset K(\sqrt{p_i^*})$ is unramified by Lemma 6.8, and consequently
$K \subset M^*$ is unramified by Lemma 6.7. But
$M^* = \mathbb{Q}(\sqrt{d_K},\sqrt{p_1^*},\dots,\sqrt{p_r^*})$ is clearly Abelian
over Q, so that $M^* \subset M$ by the maximality of M.

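To prove the opposite inclusion, we first study $\mathrm{Gal}(M/\mathbb{Q})$. Since
$\mathbb{Q} \subset M \subset L$ corresponds to
$G \supset C(\mathcal{O}_K)^2 \supset \{1\}$ under the Galois correspondence, we
have

$\mathrm{Gal}(M/\mathbb{Q}) \simeq \mathrm{Gal}(L/\mathbb{Q})/\mathrm{Gal}(L/M) = G/C(\mathcal{O}_K)^2$,

so that by (6.5), $\mathrm{Gal}(M/\mathbb{Q}) \simeq (\mathbb{Z}/2\mathbb{Z})^m$ for
some m. Then Galois theory shows that
$M = \mathbb{Q}(\sqrt{a_1},\dots,\sqrt{a_m})$ where $a_1,\dots,a_m \in \mathbb{Z}$
(see Exercise 6.10). Thus M is the compositum of quadratic extensions
$K \subset K(\sqrt{a_i})$, $a_i \in \mathbb{Z}$, and by Lemma 6.7, each of these is
unramified.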

It thus suffices to show that $M^*$ contains all unramified extensions
$K \subset K(\sqrt{a})$, $a \in \mathbb{Z}$. By Lemma 6.8, we may assume that
$a \equiv 1 \bmod 4$ and that $a \mid d_K$. It follows that a must be of the form
$p_{i_1}^* \cdots p_{i_s}^*$, $1 \le i_1 < \cdots < i_s \le r$, so that
$K(\sqrt{a})$ is clearly contained in $M^*$. This completes the proof of
(6.6).

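We will next show that $[M:\mathbb{Q}] = 2^{\mu}$. Note that
$M = \mathbb{Q}(\sqrt{d_K},\sqrt{p_1^*},\dots,\sqrt{p_r^*})$. When
$d_K \equiv 1 \bmod 4$, we have $d_K = p_1^* \cdots p_r^*$, so that
$[M:\mathbb{Q}] = 2^r = 2^{\mu}$ since $\mu = r$ in this case. When
$d_K \equiv 0 \bmod 4$, we can write $d_K = -4n$, $n > 0$, and then we have

(6.9)    $M = \begin{cases} \mathbb{Q}(i,\sqrt{p_1^*},\dots,\sqrt{p_r^*}), & n \equiv 1 \bmod 4 \\ \mathbb{Q}(\sqrt{2},\sqrt{p_1^*},\dots,\sqrt{p_r^*}), & n \equiv 6 \bmod 8 \\ \mathbb{Q}(\sqrt{-2},\sqrt{p_1^*},\dots,\sqrt{p_r^*}), & n \equiv 2 \bmod 8 \end{cases}$

(see Exercise 6.11). Thus $[M:\mathbb{Q}] = 2^{r+1} = 2^{\mu}$. Since
$[C(\mathcal{O}_K):C(\mathcal{O}_K)^2]$ equals half of
$[G:C(\mathcal{O}_K)^2] = [M:\mathbb{Q}] = 2^{\mu}$, we have proved that

(6.10)    $[C(\mathcal{O}_K):C(\mathcal{O}_K)^2] = 2^{\mu-1}$.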


We can now compute the Artin map
$((M/K)/\cdot) : I_K \to \mathrm{Gal}(M/K)$. If we set
$K_i = K(\sqrt{p_i^*})$, then M is the compositum $K_1 \cdots K_r$, and we have a
natural injection

(6.11)    $\mathrm{Gal}(M/K) \to \prod_{i=1}^{r} \mathrm{Gal}(K_i/K)$.

Furthermore, we may identify $\mathrm{Gal}(K_i/K)$ with $\{\pm 1\}$, so that
composing the Artin map with (6.11) gives us a homomorphism

$\Phi_K : I_K \to \{\pm 1\}^r$.

We claim that if $\mathfrak{a}$ is an ideal of $\mathcal{O}_K$ prime to $2d_K$, then
$\Phi_K(\mathfrak{a})$ can be computed in terms of Legendre symbols as follows:

(6.12)    $\Phi_K(\mathfrak{a}) = \left(\left(\frac{N(\mathfrak{a})}{p_1}\right),\dots,\left(\frac{N(\mathfrak{a})}{p_r}\right)\right)$,

where $N(\mathfrak{a}) = |\mathcal{O}_K/\mathfrak{a}|$ is the norm of $\mathfrak{a}$.

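To prove (6.12), we will need one basic fact about norms: if $\mathfrak{a}$ and
$\mathfrak{b}$ are ideals of $\mathcal{O}_K$, then
$N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b})$ (see Lemma 7.14 or
Marcus [77, Theorem 22]). It follows that both sides of (6.12) are multiplicative
in $\mathfrak{a}$, so that we may assume that $\mathfrak{a}$ is a prime ideal
$\mathfrak{p}$ of $\mathcal{O}_K$. Then Lemma 6.7, applied to (6.11), shows that
$((M/K)/\mathfrak{p})$ maps to the r-tuple

$\left(\left(\frac{K_1/K}{\mathfrak{p}}\right),\dots,\left(\frac{K_r/K}{\mathfrak{p}}\right)\right)$.

If we can show that

(6.13)    $\left(\frac{K_i/K}{\mathfrak{p}}\right)(\sqrt{p_i^*}) = \left(\frac{N(\mathfrak{p})}{p_i}\right)\sqrt{p_i^*}$,

then (6.12) will follow immediately.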
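To prove (6.13), let $\mathfrak{P}$ be a prime of $\mathcal{O}_{K_i}$ containing
$\mathfrak{p}$, and set $\sigma = ((K_i/K)/\mathfrak{p})$. By Lemma 5.19 we see
that

(6.14)    $\sigma(\sqrt{p_i^*}) \equiv (\sqrt{p_i^*})^{N(\mathfrak{p})} \equiv (p_i^*)^{(N(\mathfrak{p})-1)/2}\sqrt{p_i^*} \bmod \mathfrak{P}$.

Since K is a quadratic field, it follows that $N(\mathfrak{p}) = p$ or $p^2$ for
a rational prime p (see Exercise 5.11), and thus there are two cases to consider.
If $N(\mathfrak{p}) = p$, then we know that

$(p_i^*)^{(p-1)/2} \equiv \left(\frac{p_i^*}{p}\right) \bmod p$.

Since $p \in \mathfrak{P}$ and $(p_i^*/p) = (p/p_i)$ by quadratic reciprocity, we
see that (6.14) reduces to

$\sigma(\sqrt{p_i^*}) \equiv \left(\frac{p}{p_i}\right)\sqrt{p_i^*} = \left(\frac{N(\mathfrak{p})}{p_i}\right)\sqrt{p_i^*} \bmod \mathfrak{P}$,

and we are done. If $N(\mathfrak{p}) = p^2$, then by Fermat’s Little Theorem,

$(p_i^*)^{(p^2-1)/2} = \left((p_i^*)^{(p+1)/2}\right)^{p-1} \equiv 1 \bmod p$,

so that (6.14) becomes

$\sigma(\sqrt{p_i^*}) \equiv \sqrt{p_i^*} = \left(\frac{N(\mathfrak{p})}{p_i}\right)\sqrt{p_i^*} \bmod \mathfrak{P}$,

and (6.13) is proved. This proves (6.12).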
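
The two congruences used in this argument are easy to test numerically. The sketch below is illustrative only: the helper names `legendre` and `p_star` are not from the text, and the Legendre symbol is computed by Euler’s criterion. It checks the reciprocity statement $(p_i^*/p) = (p/p_i)$ and the Fermat-type congruence $(p_i^*)^{(p^2-1)/2} \equiv 1 \bmod p$ for a few small odd primes.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def p_star(p):
    """p* = (-1)^((p-1)/2) * p for an odd prime p, so that p* = 1 mod 4."""
    return p if p % 4 == 1 else -p

small_odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]

for q in small_odd_primes:          # q plays the role of p_i
    for p in small_odd_primes:      # p is the rational prime below the ideal
        if p == q:
            continue
        # Case N(p) = p: quadratic reciprocity in the form (q*/p) = (p/q).
        assert legendre(p_star(q), p) == legendre(p, q)
        # Case N(p) = p^2: the exponent (p^2 - 1)/2 is a multiple of p - 1.
        assert pow(p_star(q) % p, (p * p - 1) // 2, p) == 1

print("both congruences hold for all tested pairs")
```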

For the rest of the proof, we will assume that $d_K = -4n$, $n > 0$ (see
Exercise 6.12 for the case $d_K \equiv 1 \bmod 4$). Here, it is easily checked that
the map (6.11) is an isomorphism, and then Artin reciprocity (Corollary
5.24) for $K \subset M$ means that the map $\Phi_K : I_K \to \{\pm 1\}^r$ of (6.12)
induces an isomorphism

$A : C(\mathcal{O}_K)/C(\mathcal{O}_K)^2 \xrightarrow{\ \sim\ } \{\pm 1\}^r$,

where the A stands for Artin.

It’s now time to bring in quadratic forms. Let $C(d_K)$ be the class group
of primitive positive definite forms of discriminant $d_K = -4n$, and let P be
the principal genus. Recall from the proof of Theorem 3.15 that we have the
$\mu = r+1$ assigned characters $\chi_0,\chi_1,\dots,\chi_r$, where $\chi_0$ is one
of $\delta$, $\epsilon$ or $\delta\epsilon$, and $\chi_i(a) = (a/p_i)$ for
$i = 1,\dots,r$. In Lemma 3.20, we proved that if f(x,y) represents a number a
prime to 4n, then the genus of f(x,y) is determined by the (r+1)-tuple
$(\chi_0(a),\chi_1(a),\dots,\chi_r(a))$. Thus we have an injective map

$G : C(d_K)/P \to \{\pm 1\}^{r+1}$,

where the G stands for Gauss.
To relate the two maps A and G, we will use the isomorphism
$C(d_K) \simeq C(\mathcal{O}_K)$ of Theorem 5.30. Since $C(d_K)^2 \subset P$, we get
the following diagram:

(6.15)
$\begin{array}{ccccc}
C(d_K)/C(d_K)^2 & \xrightarrow{\ \alpha\ } & C(d_K)/P & \xrightarrow{\ G\ } & \{\pm 1\}^{r+1} \\
\downarrow{\scriptstyle\wr} & & & & \downarrow{\scriptstyle\pi} \\
C(\mathcal{O}_K)/C(\mathcal{O}_K)^2 & & \xrightarrow{\qquad A \qquad} & & \{\pm 1\}^{r}
\end{array}$

where $\alpha : C(d_K)/C(d_K)^2 \to C(d_K)/P$ is the natural surjection and $\pi$
is the projection onto the last r factors.

We claim that this diagram commutes, which means that Gauss’s definition
of genus is amazingly close to the Artin map of the genus field. (The
full story of the relation is worked out in Exercise 6.13.)

To prove that (6.15) commutes, let $f(x,y) = ax^2 + 2bxy + cy^2$ be a form
of discriminant $-4n$. We can assume that a is relatively prime to 4n. Then,
in (6.15), if we first go across and then down, we see that the class of f(x,y)
maps to

(6.16)    $(\chi_1(a),\dots,\chi_r(a)) = \left(\left(\frac{a}{p_1}\right),\dots,\left(\frac{a}{p_r}\right)\right)$.

Let’s see what happens when we go the other way. By Theorem 5.30, f(x,y)
corresponds to the ideal $\mathfrak{a} = [a, -b + \sqrt{-n}]$ of $\mathcal{O}_K$.
However, it is easy to see that the natural map

(6.17)    $\mathbb{Z}/a\mathbb{Z} \to \mathcal{O}_K/\mathfrak{a}$

is an isomorphism (see Exercise 6.14). Thus $\mathfrak{a}$ has norm
$N(\mathfrak{a}) = a$, and our description of the Artin map from (6.12) shows that
f(x,y) maps to

(6.18)    $\left(\left(\frac{N(\mathfrak{a})}{p_1}\right),\dots,\left(\frac{N(\mathfrak{a})}{p_r}\right)\right) = \left(\left(\frac{a}{p_1}\right),\dots,\left(\frac{a}{p_r}\right)\right)$.

Comparing (6.16) and (6.18), we see that (6.15) commutes as claimed.
Now everything is easy to prove. If we go down and across in (6.15), the
resulting map is injective. By commutativity, it follows that
$\alpha : C(d_K)/C(d_K)^2 \to C(d_K)/P$ must be injective, which proves that
$C(d_K)^2 = P$, and part (iv) of the theorem is done. The number of genera is thus
$[C(d_K):P] = [C(d_K):C(d_K)^2] = [C(\mathcal{O}_K):C(\mathcal{O}_K)^2] = 2^{\mu-1}$
(the last equality is (6.10)), and (iii) follows. Finally, since $P = C(d_K)^2$
corresponds to $C(\mathcal{O}_K)^2$, we see that M is the genus field of K, and
then (i) and (ii) follow from (6.2) and (6.6). Theorem 6.1 is proved. Q.E.D.
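
The count in part (iii) can be checked by machine for small field discriminants. The sketch below makes two assumptions that come from Chapter One rather than from this section: classes are listed by their reduced forms, and two forms lie in the same genus exactly when they represent the same values in $(\mathbb{Z}/D\mathbb{Z})^*$. The function names are illustrative, and $\mu$ is taken to be the number of primes dividing the discriminant, as in Exercise 6.1.

```python
from math import gcd

def reduced_forms(D):
    """Primitive reduced positive definite forms (a, b, c), b^2 - 4ac = D < 0."""
    forms = []
    a = 1
    while 3 * a * a <= -D:                 # reduced forms satisfy a <= sqrt(-D/3)
        for b in range(-a + 1, a + 1):     # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def unit_values(form, D):
    """Values in (Z/DZ)^* represented by the form."""
    a, b, c = form
    m = abs(D)
    values = set()
    for x in range(m):
        for y in range(m):
            v = (a * x * x + b * x * y + c * y * y) % m
            if gcd(v, m) == 1:
                values.add(v)
    return frozenset(values)

def count_genera(D):
    return len({unit_values(f, D) for f in reduced_forms(D)})

def num_prime_divisors(n):
    n, count, p = abs(n), 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

for d_K in (-20, -23, -56, -84):           # field discriminants
    mu = num_prime_divisors(d_K)
    print(d_K, len(reduced_forms(d_K)), count_genera(d_K), 2 ** (mu - 1))
```

For each discriminant the printed columns are the class number, the number of genera found by the search, and the predicted value $2^{\mu-1}$.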


B. Applications to the Hilbert Class Field 


Theorem 6.1 makes it easy to compute the genus field. So let’s see if the
genus field can help us find the Hilbert class field, which in general is more
difficult to compute. The nicest case is when the genus field equals the
Hilbert class field, which happens for field discriminants where every genus
consists of a single class (see Exercise 6.15). In particular, if $d_K = -4n$,
then this means that n is one of Euler’s convenient numbers (see Proposition
3.24). Of the 65 convenient numbers on Gauss’ list in §3, 35 satisfy the
additional condition that $d_K = -4n$ (see Exercise 6.15), so that we can
determine lots of Hilbert class fields. For example, when
$K = \mathbb{Q}(\sqrt{-5})$, Theorem 6.1 tells us that the Hilbert class field is
$K(\sqrt{5}) = K(i)$. Other examples are just as easy to work out (see
Exercise 6.16).
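
Part (ii) of Theorem 6.1 turns the computation of the genus field into a factorization. As a minimal sketch (the function names are illustrative, and the input is assumed to be a field discriminant), the following lists the numbers $p_i^*$ whose square roots generate the genus field over K.

```python
def odd_prime_divisors(n):
    """Odd primes dividing the nonzero integer n, in increasing order."""
    n = abs(n)
    while n % 2 == 0:
        n //= 2
    primes, p = [], 3
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 2
    if n > 1:
        primes.append(n)
    return primes

def genus_field_generators(d_K):
    """The p_i^* = (-1)^((p_i - 1)/2) p_i for the odd primes p_i dividing d_K.
    The genus field of K is K adjoined the square roots of these numbers."""
    return [p if p % 4 == 1 else -p for p in odd_prime_divisors(d_K)]

print(genus_field_generators(-20))   # K = Q(sqrt(-5)):  [5]
print(genus_field_generators(-56))   # K = Q(sqrt(-14)): [-7]
print(genus_field_generators(-84))   # K = Q(sqrt(-21)): [-3, -7]
```

For $d_K = -20$ this recovers the field $K(\sqrt{5})$ mentioned above.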

The more typical situation is when the Hilbert class field is strictly
bigger than the genus field. It turns out that the genus field can still provide
us with useful information about the Hilbert class field. Let’s consider the case
$K = \mathbb{Q}(\sqrt{-14})$. Here, the genus field is
$M = K(\sqrt{-7}) = K(\sqrt{2})$ by Theorem 6.1. Since the class number is 4, we
know that the Hilbert class field is a quadratic extension of M, so that
$L = M(\sqrt{u})$ for some $u \in M$. This is already useful information, but we
can do better. In Theorem 5.1, we saw the importance of a real primitive element
of the Hilbert class field. So let’s intersect everything with the real numbers.
This gives us the quadratic extension
$M \cap \mathbb{R} \subset L \cap \mathbb{R}$. Since
$M = K(\sqrt{2}) = \mathbb{Q}(\sqrt{-14},\sqrt{2})$, it follows that
$M \cap \mathbb{R} = \mathbb{Q}(\sqrt{2})$. Thus we can write
$L \cap \mathbb{R} = \mathbb{Q}(\sqrt{2},\sqrt{u})$, where $u > 0$ is in
$\mathbb{Q}(\sqrt{2})$, and from this it is easy to prove that

$L = K(\sqrt{u}), \quad u = a + b\sqrt{2} > 0, \quad a, b \in \mathbb{Z}$

(see Exercise 6.17). Hence genus theory explains the form of the primitive
element $\alpha = \sqrt{2\sqrt{2}-1}$ of Proposition 5.31. In Exercise 6.18, we
will continue this discussion and show how one can take $u = a + b\sqrt{2}$ and
discover the precise form $u = 2\sqrt{2} - 1$ of the primitive element of the
Hilbert class field.

It’s interesting to compare this discussion of $x^2 + 14y^2$ to what we did in
§3. The genus theory developed in §3 told us when p was represented by
$x^2 + 14y^2$ or $2x^2 + 7y^2$, but this partial information didn’t help in
deciding when $p = x^2 + 14y^2$. In contrast, the genus theory of Theorem 6.1
determines the genus field, which helps us understand the Hilbert class field. The
field-theoretic approach seems to have more useful information.
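
A small search makes the limitation of genus theory concrete. The sketch below is illustrative only: it assumes (from §3, not from this section) that the principal genus of discriminant $-56$ corresponds to the residues 1, 9, 15, 23, 25, 39 mod 56, and the helper names are hypothetical. For each prime in those classes it reports which of the two forms in the genus represents it.

```python
from math import isqrt

def represents(a, c, m):
    """True if a*x^2 + c*y^2 = m has a solution in integers (a, c > 0)."""
    y = 0
    while c * y * y <= m:
        rest = m - c * y * y
        if rest % a == 0:
            x2 = rest // a
            if isqrt(x2) ** 2 == x2:
                return True
        y += 1
    return False

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Residues mod 56 attached to the principal genus (assumed from Section 3).
PRINCIPAL_GENUS = {1, 9, 15, 23, 25, 39}

for p in range(3, 200):
    if is_prime(p) and p % 56 in PRINCIPAL_GENUS:
        by_first = represents(1, 14, p)    # x^2 + 14y^2
        by_second = represents(2, 7, p)    # 2x^2 + 7y^2
        print(p, "x^2+14y^2" if by_first else "", "2x^2+7y^2" if by_second else "")
```

Primes in the same residue class mod 56 can land on different forms (compare 23 and 79, both congruent to 23), which is why a congruence condition alone cannot single out $p = x^2 + 14y^2$.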

This ends our discussion of genus theory, but it by no means exhausts
the topic. For more complete treatments of genus theory from the point
of view of class field theory, see Hasse [51], Janusz [62, §VI.3] and Cohn’s
two books [19, Chapters 14 and 18] and [21, Chapter 8]. Genus theory can
also be studied by standard methods of algebraic number theory, with no
reference to class field theory. Both Cohn [20, Chapter XIII] and Hasse
[50, §§26.8 and 29.3] use the Hilbert symbol in their discussion of genera.
For a more elementary approach, see Zagier [111]. Genus theory can also
be generalized in several ways. It is possible to define the genus field of
an arbitrary number field (see Ishida [60]), and in another direction, one
can formulate genus theory from the point of view of algebraic groups and
Tamagawa numbers (Ono [82] has a nice introduction to this subject). For
a survey of all these aspects of genus theory, see Frei [39].


C. Exercises 


6.1. Let $d_K$ be the discriminant of a quadratic field. When considering
forms of discriminant $d_K$, show that the number $\mu$ from Proposition
3.11 is just the number of primes dividing $d_K$.


6.2. Suppose that we have fields $K \subset M \subset L$, where L is Galois over K
with group $G = \mathrm{Gal}(L/K)$. Prove that M is Abelian over K if and
only if $[G,G] \subset \mathrm{Gal}(L/M)$.


6.3. Prove statement (6.4).


6.4. If K is an imaginary quadratic field and M is an unramified Abelian
extension of K, then prove that M is Galois over Q. Hint: use the
description of $\mathrm{Gal}(L/\mathbb{Q})$, where L is the Hilbert class field of K.


6.5. Prove statement (6.5).


6.6. In this problem we will prove Lemma 6.7. Let $\mathfrak{p}$ be a prime of
$\mathcal{O}_K$.

(a) If $\mathfrak{p}$ is unramified in LM, then use Exercise 5.15 to show that
it’s unramified in both L and M.

(b) Prove the converse of (a). Hint: assume not. Then use the facts
about the decomposition group from Proposition 5.10 to find
$\sigma \in \mathrm{Gal}(LM/K)$, $\sigma \ne 1$, such that
$\sigma(\alpha) \equiv \alpha \bmod \mathfrak{P}$ for all
$\alpha \in \mathcal{O}_{LM}$ (and $\mathfrak{P}$ is a prime of
$\mathcal{O}_{LM}$ containing $\mathfrak{p}$). Argue that $\sigma|_L$ and
$\sigma|_M$ are the identity. Note that (a) and (b) prove part (i) of Lemma
6.7.

(c) Use Exercise 5.16 to prove part (ii) of Lemma 6.7.

(d) With the same hypothesis as Lemma 6.7, show that $\mathfrak{p}$ splits
completely in LM if and only if it splits completely in both L and M.
In Exercise 8.14 we will see that this result can be proved without
assuming that L and M are Galois over K.


6.7. Let $K = \mathbb{Q}(i, \sqrt{2m})$, where $m \in \mathbb{Z}$ is odd and
squarefree.

(a) Let $\alpha = (1+i)\sqrt{2m}/2$. Show that $\alpha^2 = im$, and conclude
that $\alpha \in \mathcal{O}_K$. (It turns out that $1, i, \sqrt{2m}$ and
$\alpha$ form an integral basis of $\mathcal{O}_K$; see Marcus [77, Exercise 42
to Chapter 2].)

(b) Let $\mathfrak{P}$ be the ideal of $\mathcal{O}_K$ generated by $1+i$ and
$1+\alpha$. Show that $2\mathcal{O}_K = \mathfrak{P}^4$, and conclude that
$\mathfrak{P}$ is prime. Hint: compute $\mathfrak{P}^2$.


6.8. Let K be an imaginary quadratic field. We want to show that if
$K \subset K(i)$ is unramified, then $d_K \equiv 12 \bmod 16$.

(a) Show that $K \subset K(i)$ is ramified when $d_K \equiv 1 \bmod 4$. Hint:
consider the diagram of fields

          K(i)
         /    \
        K      Q(i)
         \    /
           Q

If $K \subset K(i)$ is unramified, show that 2 is unramified in K(i). But
2 ramifies in Q(i). Exercise 5.15 will be useful.

(b) Show that the extension is ramified when $d_K \equiv 0 \bmod 8$. Hint: if
it’s unramified, show that the ramification index of 2 in K(i) is at
most 2. Then use Exercise 6.7.

Since an even discriminant is of the form 4N, where $N \equiv 2, 3 \bmod 4$,
it follows from (a) and (b) that $d_K \equiv 12 \bmod 16$ when $K \subset K(i)$ is
unramified.


6.9. In this exercise we will prove Lemma 6.8.

(a) Prove that $K \subset K(\sqrt{a})$ is unramified when $a \mid d_K$ and
$a \equiv 1 \bmod 4$. Hint: when $2 \notin \mathfrak{p}$, note that $d_K = ab$,
where $K(\sqrt{a}) = K(\sqrt{b})$.

(b) Assume that $K \subset K(\sqrt{a})$ is unramified. Show that $a \mid d_K$ and
consequently that a may be chosen to be odd. Hint: if p is a prime
such that $p \mid a$, $p \nmid d_K$, then analyze p in the fields

        K(√a)
        /    \
    Q(√a)     K
        \    /
          Q

(c) Let $K \subset K(\sqrt{a})$ be unramified, where $a \mid d_K$ is odd.

(i) If $a \equiv 3 \bmod 4$, show that $d_K \equiv 12 \bmod 16$. Hint: apply (a)
to $-a$, and then use Exercise 6.8.
(ii) If $d_K \equiv 12 \bmod 16$, show that $K(\sqrt{a}) = K(\sqrt{b})$, where
$b \mid d_K$ and $b \equiv 1 \bmod 4$. Hint: factor $d_K$.

Lemma 6.8 follows easily from (a)-(c).


6.10. If M is a Galois extension of Q and
$\mathrm{Gal}(M/\mathbb{Q}) \simeq (\mathbb{Z}/2\mathbb{Z})^m$, then show that
$M = \mathbb{Q}(\sqrt{a_1},\dots,\sqrt{a_m})$, $a_i \in \mathbb{Z}$ squarefree.


6.11. Prove the description of the genus field M given in (6.9).

6.12. Complete the proof of Theorem 6.1 when $d_K \equiv 1 \bmod 4$, $d_K < 0$.


6.13. Let K be an imaginary quadratic field of discriminant $-4n$. The
description of the genus field M given in (6.9) gives us an isomorphism

$\mathrm{Gal}(M/\mathbb{Q}) \xrightarrow{\ \sim\ } \{\pm 1\}^{\mu}$.

However, we also have maps

$C(-4n) \to C(\mathcal{O}_K) \to \mathrm{Gal}(M/K)$.

If we combine these with the natural inclusion
$\mathrm{Gal}(M/K) \subset \mathrm{Gal}(M/\mathbb{Q})$, then we get a map

$C(-4n) \to \{\pm 1\}^{\mu}$.

Show that this map is exactly what Gauss used in his definition of
genus. Hint: it’s fun to see the characters $\epsilon$, $\delta$ and
$\epsilon\delta$ from §3 reappear. For example, when n is odd, the key step is to
show that

$\left(\frac{K(i)/K}{\mathfrak{a}}\right)(i) = \delta(N(\mathfrak{a}))\, i$

for ideals $\mathfrak{a}$ prime to 4n. The proof is similar to the proof of (6.13).


6.14. Prove that the map (6.17) is an isomorphism.


6.15. In this exercise we will study when the genus field equals the Hilbert
class field.

(a) Prove that the genus field of an imaginary quadratic field K
equals its Hilbert class field if and only if for primitive positive
definite forms of discriminant $d_K$, there is only one class per genus.

(b) Of Gauss’ list of 65 convenient numbers n in §3, which satisfy
the condition (5.2) that guarantees that $-4n$ is a field discriminant?
This gives us a list of fields where we know the Hilbert class field.


6.16. Compute the Hilbert class fields of the fields $\mathbb{Q}(\sqrt{-6})$,
$\mathbb{Q}(\sqrt{-10})$ and $\mathbb{Q}(\sqrt{-35})$.


6.17. Let $K = \mathbb{Q}(\sqrt{-14})$, and let L be the Hilbert class field of K.
The genus field M of K is $K(\sqrt{-7}) = K(\sqrt{2})$, so that L is a degree 2
extension of M. Use the hints in the text to show that $L = K(\sqrt{u})$,
where $u = a + b\sqrt{2} > 0$, $a, b \in \mathbb{Z}$.


6.18. In this exercise we will discover a primitive element for the Hilbert
class field L of $K = \mathbb{Q}(\sqrt{-14})$. From the previous exercise, we know
that $L = K(\sqrt{u})$, where $u = a + b\sqrt{2} > 0$, $a, b \in \mathbb{Z}$. Let
$u' = a - b\sqrt{2}$.

(a) Show that $\mathrm{Gal}(L/\mathbb{Q})$ is the dihedral group
$\langle \sigma, \tau : \sigma^4 = 1, \tau^2 = 1, \sigma\tau = \tau\sigma^3 \rangle$
of order 8, where $\sigma(\sqrt{u}) = \sqrt{u'}$ and $\tau$ is complex
conjugation. Conclude that $\sigma^2(\sqrt{u}) = -\sqrt{u}$ and
$\tau(\sqrt{u'}) = -\sqrt{u'}$.

(b) Show that $\mathbb{Q}(\sqrt{-7})$ is the fixed field of $\sigma^2$ and
$\sigma\tau$.

(c) Show that $\sqrt{uu'}$ is fixed by $\sigma^2$ and $\sigma\tau$, and then using
$\tau$, conclude that $\sqrt{uu'} = m\sqrt{-7}$, $m \in \mathbb{Z}$.

(d) Let N be the norm function on $\mathbb{Q}(\sqrt{2})$, and let
$\pi = 2\sqrt{2} - 1$. Note that $N(\pi) = -7$. Show that $u = \pi\alpha$, where
$N(\alpha) = m^2$. Hint: $\mathbb{Z}[\sqrt{2}]$ is a UFD. You may have to switch
u and u'.

(e) Assume that u has no square factors in $\mathbb{Z}[\sqrt{2}]$. Then show that
$u = \epsilon\pi n$, where $\epsilon$ is a unit and n is a squarefree integer
prime to 14. Hint: use Proposition 5.16 to describe the primes in
$\mathbb{Z}[\sqrt{2}]$.

(f) Show that n must be $\pm 1$. Hint: note that
$u\mathcal{O}_L = \pi\mathcal{O}_L \cdot n\mathcal{O}_L$ is a square, and conclude
that any prime dividing n ramifies in L.

(g) Thus $u = \epsilon\pi$ by (f). All units of $\mathbb{Z}[\sqrt{2}]$ are of the
form $\pm(\sqrt{2} - 1)^m$ (see Hasse [50, pp. 554-556]). Since $N(u) = -7$ and
$N(\sqrt{2} - 1) = -1$, we can assume $u = \pi$ since $u > 0$.

This proves that $\sqrt{u} = \sqrt{\pi} = \sqrt{2\sqrt{2} - 1}$ is the desired
primitive element.


6.19. Adapt Exercises 6.17 and 6.18 to discover a primitive element for
the Hilbert class field of $\mathbb{Q}(\sqrt{-17})$. Hint: see Exercise 5.25. You
may assume that the integers in $\mathbb{Q}(\sqrt{17})$ are a UFD and that the
units are all of the form $\pm(4 + \sqrt{17})^m$, $m \in \mathbb{Z}$ (see Borevich
and Shafarevich [8, p. 422]). This method will lead most naturally to
$u = 4 + \sqrt{17}$, which is related to our earlier choice $(1 + \sqrt{17})/2$ via

$(4 + \sqrt{17}) \cdot (1 + \sqrt{17})/2 = \left((5 + \sqrt{17})/2\right)^2$.

This problem may also be done without using the fact that $\mathbb{Q}(\sqrt{17})$
has class number 1 (see Herz [56]).


6.20. Let $K = \mathbb{Q}(\sqrt{-55})$.
(a) Show that $C(\mathcal{O}_K) \simeq \mathbb{Z}/4\mathbb{Z}$.
(b) Determine the Hilbert class field of K. Hint: use the methods
of Proposition 5.31. Exercises 6.17 and 6.18 will show you what
to look for.

(c) Prove an analog of Theorem 5.33 for primes of the form
$x^2 + 55y^2$.
In §5, we solved our basic question of $p = x^2 + ny^2$ for those n’s where
$\mathbb{Z}[\sqrt{-n}]$ is the full ring of integers $\mathcal{O}_K$ in
$K = \mathbb{Q}(\sqrt{-n})$ (see (5.15)). This holds for infinitely many n’s, but
it also leaves out infinitely many. The full story of what happens for these other
n’s will be told in §9, and we will see that the answer involves the ring
$\mathbb{Z}[\sqrt{-n}]$. Such a ring is an example of an order in an imaginary
quadratic field, which brings us to the main topic of §7.

We begin this section with a study of orders in a quadratic field K.
Unlike $\mathcal{O}_K$, an order $\mathcal{O}$ is usually not a Dedekind domain,
so that the ideal theory of $\mathcal{O}$ is more complicated. This will lead us
to restrict the class of ideals under consideration. In the case of imaginary
quadratic fields, there is a nice relation between ideals in orders and quadratic
forms. In particular, an order $\mathcal{O}$ has an ideal class group
$C(\mathcal{O})$, and we will show that for any discriminant $D < 0$, the form
class group $C(D)$ from §3 is naturally isomorphic to $C(\mathcal{O})$ for a
suitable order $\mathcal{O}$. Then, to prepare the way for class field theory, we
will show how to translate ideals for an order $\mathcal{O}$ in K into terms of
the maximal order $\mathcal{O}_K$. The section will conclude with a discussion of
class numbers.


A. Orders in Quadratic Fields 


An order $\mathcal{O}$ in a quadratic field K is a subset
$\mathcal{O} \subset K$ such that

(i) $\mathcal{O}$ is a subring of K containing 1.

(ii) $\mathcal{O}$ is a finitely generated $\mathbb{Z}$-module.
(iii) $\mathcal{O}$ contains a $\mathbb{Q}$-basis of K.
Since $\mathcal{O}$ is clearly torsion-free, (ii) and (iii) are equivalent to
$\mathcal{O}$ being a free $\mathbb{Z}$-module of rank 2 (see Exercise 7.1). Note
also that by (iii), K is the field of fractions of $\mathcal{O}$.

The ring $\mathcal{O}_K$ of integers in K is always an order in K; this follows
from the description (5.13) of $\mathcal{O}_K$ given in §5. More importantly, (i)
and (ii) above imply that for any order $\mathcal{O}$ of K, we have
$\mathcal{O} \subset \mathcal{O}_K$ (see Exercise 7.2), so that $\mathcal{O}_K$ is
the maximal order of K.

To describe orders in quadratic fields more explicitly, first note that by
(5.14), the maximal order $\mathcal{O}_K$ can be written as follows:

(7.1)    $\mathcal{O}_K = [1, w_K], \quad w_K = \frac{d_K + \sqrt{d_K}}{2}$,

where $d_K$ is the discriminant of K. We can now describe all orders in
quadratic fields:


Lemma 7.2. Let $\mathcal{O}$ be an order in a quadratic field K of discriminant
$d_K$. Then $\mathcal{O}$ has finite index in $\mathcal{O}_K$, and if we set
$f = [\mathcal{O}_K : \mathcal{O}]$, then

$\mathcal{O} = \mathbb{Z} + f\mathcal{O}_K = [1, f w_K]$,

where $w_K$ is as in (7.1).


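Proof. Since $\mathcal{O}$ and $\mathcal{O}_K$ are free $\mathbb{Z}$-modules of
rank 2, it follows that $[\mathcal{O}_K : \mathcal{O}] < \infty$. Setting
$f = [\mathcal{O}_K : \mathcal{O}]$, we have $f\mathcal{O}_K \subset \mathcal{O}$,
and then $\mathbb{Z} + f\mathcal{O}_K \subset \mathcal{O}$ follows. However, (7.1)
implies $\mathbb{Z} + f\mathcal{O}_K = [1, f w_K]$, so that to prove the lemma, we
need only show that $[1, f w_K]$ has index f in $\mathcal{O}_K = [1, w_K]$. This
last fact is obvious, and we are done. Q.E.D.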
Given an order $\mathcal{O}$ in a quadratic field K, the index
$f = [\mathcal{O}_K : \mathcal{O}]$ is called the conductor of the order. Another
important invariant of $\mathcal{O}$ is its discriminant, which is defined as
follows. Let $\alpha \mapsto \alpha'$ be the nontrivial automorphism of K, and
suppose that $\mathcal{O} = [\alpha, \beta]$. Then the discriminant of
$\mathcal{O}$ is the number

$D = \left(\det\begin{pmatrix} \alpha & \beta \\ \alpha' & \beta' \end{pmatrix}\right)^2$.

The discriminant is independent of the integral basis used, and if we compute
D using the basis $\mathcal{O} = [1, f w_K]$ from Lemma 7.2, then we obtain the
formula

(7.3)    $D = f^2 d_K$.

Thus the discriminant satisfies $D \equiv 0, 1 \bmod 4$. From (7.3) we also see
that $K = \mathbb{Q}(\sqrt{D})$, so that K is real or imaginary according to
whether $D > 0$ or $D < 0$. In fact, one can show that D determines $\mathcal{O}$
uniquely and that any nonsquare integer $D \equiv 0, 1 \bmod 4$ is the
discriminant of an order in a quadratic field. See Exercise 7.3 for proofs of
these elementary facts. Note that by (7.3), the discriminant of the maximal order
$\mathcal{O}_K$ is $d_K$, which agrees with the definition given in §5.

For an example of an order, consider
$\mathbb{Z}[\sqrt{-n}] \subset K = \mathbb{Q}(\sqrt{-n})$. The discriminant of
$\mathbb{Z}[\sqrt{-n}]$ is easily computed to be $-4n$, and then (7.3) shows
that

$-4n = f^2 d_K$.

This makes it easy to compute the conductor of $\mathbb{Z}[\sqrt{-n}]$. This
order will be used in §9 when we give the general solution of $p = x^2 + ny^2$.
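
The conductor computation can be sketched in a few lines. The code below is an illustration rather than anything from the text: the function names are hypothetical, and it relies on the standard characterization of field discriminants (either $d \equiv 1 \bmod 4$ squarefree, or $d = 4m$ with $m \equiv 2, 3 \bmod 4$ squarefree). It searches for the f with $D = f^2 d_K$.

```python
def is_squarefree(m):
    m = abs(m)
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return m != 0

def is_field_discriminant(d):
    """True if d is the discriminant of a quadratic field."""
    if d % 4 == 1:
        return d != 1 and is_squarefree(d)
    if d % 4 == 0:
        m = d // 4
        return m % 4 in (2, 3) and is_squarefree(m)
    return False

def conductor_and_field_discriminant(D):
    """Return (f, d_K) with D = f^2 * d_K, for a nonsquare D = 0, 1 mod 4."""
    f = 1
    while f * f <= abs(D):
        if D % (f * f) == 0 and is_field_discriminant(D // (f * f)):
            return f, D // (f * f)
        f += 1
    raise ValueError("D is not the discriminant of an order")

for n in (3, 5, 7, 14, 27):
    f, d_K = conductor_and_field_discriminant(-4 * n)
    print(f"Z[sqrt(-{n})]: D = {-4 * n}, conductor f = {f}, d_K = {d_K}")
```

For instance, $\mathbb{Z}[\sqrt{-3}]$ has $D = -12 = 2^2 \cdot (-3)$, so its conductor is 2, while $\mathbb{Z}[\sqrt{-14}]$ is the maximal order.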

Now let’s study the ideals of an order $\mathcal{O}$. If $\mathfrak{a}$ is a
nonzero ideal of $\mathcal{O}$, then the proof of Corollary 5.4 adapts easily to
show that $\mathcal{O}/\mathfrak{a}$ is finite (see Exercise 7.4). Thus we can
define the norm of $\mathfrak{a}$ to be
$N(\mathfrak{a}) = |\mathcal{O}/\mathfrak{a}|$. Furthermore, as in the proof of
Theorem 5.5, it follows that $\mathcal{O}$ is Noetherian and that every nonzero
prime ideal of $\mathcal{O}$ is maximal (see Exercise 7.4). However, it is equally
obvious that if the conductor f of $\mathcal{O}$ is greater than 1, then
$\mathcal{O}$ is not integrally closed in K, so that $\mathcal{O}$ is not a
Dedekind domain when $f > 1$. Thus we may not assume that the ideals of
$\mathcal{O}$ have unique factorization.

To remedy this situation, we will introduce the concept of a proper ideal
of an order. Namely, given any ideal $\mathfrak{a}$ of $\mathcal{O}$, notice that

$\mathcal{O} \subset \{\beta \in K : \beta\mathfrak{a} \subset \mathfrak{a}\}$

since $\mathfrak{a}$ is an ideal of $\mathcal{O}$. However, equality need not
occur. For example, if $\mathcal{O} = \mathbb{Z}[\sqrt{-3}]$ is the order of
conductor 2 in $K = \mathbb{Q}(\sqrt{-3})$, and $\mathfrak{a}$ is the ideal of
$\mathcal{O}$ generated by 2 and $1 + \sqrt{-3}$, then one sees easily that

$\mathcal{O} \ne \{\beta \in K : \beta\mathfrak{a} \subset \mathfrak{a}\} = \mathcal{O}_K$

(see Exercise 7.5). In general, we say that an ideal $\mathfrak{a}$ of
$\mathcal{O}$ is proper whenever equality holds, i.e., when

$\mathcal{O} = \{\beta \in K : \beta\mathfrak{a} \subset \mathfrak{a}\}$.

For example, principal ideals are always proper, and for the maximal order,
all ideals are proper (see Exercise 7.6).
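
The example with $\mathbb{Z}[\sqrt{-3}]$ can be verified by exact arithmetic. In the sketch below, elements $x + y\sqrt{-3}$ of K are stored as pairs of rationals, and the membership tests use the description of $\mathfrak{a}$ as the $\mathbb{Z}$-module $[2, 1+\sqrt{-3}]$, i.e. the set of $x + y\sqrt{-3}$ with x, y integers of the same parity. That description and the helper names are assumptions of this illustration, not statements from the text.

```python
from fractions import Fraction as F

def mul(u, v):
    """Product of x1 + y1*sqrt(-3) and x2 + y2*sqrt(-3)."""
    (x1, y1), (x2, y2) = u, v
    return (x1 * x2 - 3 * y1 * y2, x1 * y2 + x2 * y1)

def is_integer(q):
    return F(q).denominator == 1

def in_order(u):
    """Membership in O = Z[sqrt(-3)]."""
    return is_integer(u[0]) and is_integer(u[1])

def in_ideal(u):
    """Membership in a = [2, 1 + sqrt(-3)]: integer coordinates, same parity."""
    return in_order(u) and (u[0] - u[1]) % 2 == 0

basis = [(F(2), F(0)), (F(1), F(1))]   # Z-basis 2, 1 + sqrt(-3) of the ideal
omega = (F(1, 2), F(1, 2))             # (1 + sqrt(-3))/2, in O_K but not in O

# omega maps a Z-basis of the ideal into the ideal, hence omega * a is inside a.
print([in_ideal(mul(omega, b)) for b in basis])   # both products lie in a
print(in_order(omega))                            # omega itself is not in O
```

So the set $\{\beta \in K : \beta\mathfrak{a} \subset \mathfrak{a}\}$ contains an element outside $\mathcal{O}$, and $\mathfrak{a}$ is not proper.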

We can also extend this terminology to fractional ideals. A fractional
ideal of $\mathcal{O}$ is a subset of K which is a nonzero finitely generated
$\mathcal{O}$-module. One can show that every fractional ideal is of the form
$\alpha\mathfrak{a}$, where $\alpha \in K^*$ and $\mathfrak{a}$ is an
$\mathcal{O}$-ideal (see Exercise 7.7). Then a fractional $\mathcal{O}$-ideal
$\mathfrak{b}$ is proper provided that

$\mathcal{O} = \{\beta \in K : \beta\mathfrak{b} \subset \mathfrak{b}\}$.

Once we have fractional ideals, we can also talk about invertible ideals:
a fractional $\mathcal{O}$-ideal $\mathfrak{a}$ is invertible if there is another
fractional $\mathcal{O}$-ideal $\mathfrak{b}$ such that
$\mathfrak{a}\mathfrak{b} = \mathcal{O}$. Note that principal fractional ideals
(those of the form $\alpha\mathcal{O}$, $\alpha \in K^*$) are obviously
invertible. The basic result is that for orders in quadratic fields, the notions
of proper and invertible coincide:


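Proposition 7.4. Let $\mathcal{O}$ be an order in a quadratic field K, and let
$\mathfrak{a}$ be a fractional $\mathcal{O}$-ideal. Then $\mathfrak{a}$ is proper
if and only if $\mathfrak{a}$ is invertible.


Proof. If $\mathfrak{a}$ is invertible, then
$\mathfrak{a}\mathfrak{b} = \mathcal{O}$ for some fractional $\mathcal{O}$-ideal
$\mathfrak{b}$. If $\beta \in K$ and $\beta\mathfrak{a} \subset \mathfrak{a}$, then
we have

$\beta\mathcal{O} = \beta(\mathfrak{a}\mathfrak{b}) = (\beta\mathfrak{a})\mathfrak{b} \subset \mathfrak{a}\mathfrak{b} = \mathcal{O}$,

and $\beta \in \mathcal{O}$ follows, proving that $\mathfrak{a}$ is proper.
To argue the other way, we will need the following lemma: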


Lemma 7.5. Let $K = \mathbb{Q}(\tau)$ be a quadratic field, and let
$ax^2 + bx + c$ be the minimal polynomial of $\tau$, where a, b and c are
relatively prime integers. Then $[1, \tau]$ is a proper fractional ideal for the
order $[1, a\tau]$ of K.


Proof. First, $[1, a\tau]$ is an order since $a\tau$ is an algebraic integer.
Then, given $\beta \in K$, note that $\beta[1,\tau] \subset [1,\tau]$ is
equivalent to

$\beta \cdot 1 \in [1, \tau]$
$\beta \cdot \tau \in [1, \tau]$.

The first line says $\beta = m + n\tau$, $m, n \in \mathbb{Z}$. To understand the
second, note that

$\beta\tau = m\tau + n\tau^2 = m\tau + \frac{n}{a}(-b\tau - c) = \frac{-cn}{a} + \left(\frac{-bn}{a} + m\right)\tau$.

Since $\gcd(a, b, c) = 1$, we see that $\beta\tau \in [1, \tau]$ if and only if
$a \mid n$. It follows that

$\{\beta \in K : \beta[1,\tau] \subset [1,\tau]\} = [1, a\tau]$,

which proves the lemma. Q.E.D.


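Now we can prove that proper fractional ideals are invertible. First
note that $\mathfrak{a}$ is a $\mathbb{Z}$-module of rank 2 (see Exercise 7.8), so
that $\mathfrak{a} = [\alpha, \beta]$ for some $\alpha, \beta \in K$. Then
$\mathfrak{a} = \alpha[1, \tau]$, where $\tau = \beta/\alpha$. If
$ax^2 + bx + c$, $\gcd(a, b, c) = 1$, is the minimal polynomial of $\tau$, then
Lemma 7.5 implies that $\mathcal{O} = [1, a\tau]$. Let $\beta \mapsto \beta'$
denote the nontrivial automorphism of K. Since $\tau'$ is the other root of
$ax^2 + bx + c$, using Lemma 7.5 again shows that
$\mathfrak{a}' = \alpha'[1, \tau']$ is a fractional ideal for
$[1, a\tau] = [1, a\tau'] = \mathcal{O}$. We claim that

(7.6)    $\mathfrak{a}\mathfrak{a}' = \frac{N(\alpha)}{a}\,\mathcal{O}$.

To see why, note that

$a\mathfrak{a}\mathfrak{a}' = a\alpha\alpha'[1,\tau][1,\tau'] = N(\alpha)[a, a\tau, a\tau', a\tau\tau']$.

Since $\tau + \tau' = -b/a$ and $\tau\tau' = c/a$, this becomes

$a\mathfrak{a}\mathfrak{a}' = N(\alpha)[a, a\tau, -b, c] = N(\alpha)[1, a\tau] = N(\alpha)\mathcal{O}$

since $\gcd(a, b, c) = 1$. This implies (7.6), which in turn proves that
$\mathfrak{a}$ is invertible. Q.E.D.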


Unfortunately, Proposition 7.4 is not strong enough to prove unique
factorization for proper ideals (see Exercise 7.9 for a counterexample). Later
we will see that unique factorization holds for a slightly smaller class of
ideals, those prime to the conductor.

Given an order $\mathcal{O}$, let $I(\mathcal{O})$ denote the set of proper
fractional $\mathcal{O}$-ideals. By Proposition 7.4, $I(\mathcal{O})$ is a group
under multiplication: the crucial issues are closure and the existence of
inverses, both of which follow from the invertibility of proper ideals (see
Exercise 7.10). The principal $\mathcal{O}$-ideals give a subgroup
$P(\mathcal{O}) \subset I(\mathcal{O})$, and thus we can form the quotient

$C(\mathcal{O}) = I(\mathcal{O})/P(\mathcal{O})$,

which is the ideal class group of the order $\mathcal{O}$. When $\mathcal{O}$ is
the maximal order $\mathcal{O}_K$, $I(\mathcal{O}_K)$ and $P(\mathcal{O}_K)$ will
be denoted $I_K$ and $P_K$, respectively. This is the notation used in §5, and in
general we will reserve the subscript K exclusively for the maximal order. Note
that the above definition of $C(\mathcal{O}_K)$ agrees with the one given in §5.
B. Orders and Quadratic Forms


We can relate the ideal class group $C(\mathcal{O})$ to the form class group
$C(D)$ defined in §3 as follows:


Theorem 7.7. Let $\mathcal{O}$ be the order of discriminant D in an imaginary
quadratic field K.

(i) If $f(x,y) = ax^2 + bxy + cy^2$ is a primitive positive definite quadratic
form of discriminant D, then $[a, (-b + \sqrt{D})/2]$ is a proper ideal of
$\mathcal{O}$.

(ii) The map sending f(x,y) to $[a, (-b + \sqrt{D})/2]$ induces an isomorphism
between the form class group $C(D)$ and the ideal class group $C(\mathcal{O})$.
Hence the order of $C(\mathcal{O})$ is the class number $h(D)$.

(iii) A positive integer m is represented by a form f(x,y) if and only if m is
the norm $N(\mathfrak{a})$ of some ideal $\mathfrak{a}$ in the corresponding ideal
class in $C(\mathcal{O})$ (recall that
$N(\mathfrak{a}) = |\mathcal{O}/\mathfrak{a}|$).


Remark. Because of the isomorphism $C(D) \simeq C(\mathcal{O})$, we will sometimes
write the class number as $h(\mathcal{O})$ instead of $h(D)$.


Proof. Let $f(x,y) = ax^2 + bxy + cy^2$ be a primitive positive definite form
of discriminant $D < 0$. The roots of $f(x,1) = ax^2 + bx + c$ are complex, so
that there is a unique $\tau \in \mathfrak{h}$ ($\mathfrak{h}$ is the upper half
plane) such that $f(\tau,1) = 0$. We call $\tau$ the root of f(x,y). Since
$a > 0$, it follows that $\tau = (-b + \sqrt{D})/2a$. Thus

$[a, (-b + \sqrt{D})/2] = [a, a\tau] = a[1, \tau]$.

Note also that $\tau \in K$.

To prove (i), note that by Lemma 7.5, $a[1,\tau]$ is a proper ideal for the
order $[1, a\tau]$. However, if f is the conductor of $\mathcal{O}$, then
$D = f^2 d_K$ by (7.3), and thus

$a\tau = \frac{-b + \sqrt{D}}{2} = \frac{-b + f\sqrt{d_K}}{2} = -\frac{b + f d_K}{2} + f\left(\frac{d_K + \sqrt{d_K}}{2}\right) = -\frac{b + f d_K}{2} + f w_K$.

Since $D = b^2 - 4ac$, $f d_K$ and b have the same parity, so that
$(b + f d_K)/2 \in \mathbb{Z}$. It follows that $[1, a\tau] = [1, f w_K]$, so that
$[1, a\tau] = \mathcal{O}$ by Lemma 7.2. This proves that $a[1,\tau]$ is a proper
$\mathcal{O}$-ideal.
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To prove (ii), let f(x,y) and g(x,y) be forms of discriminant D, and let 
T and 7’ be their respective roots. We will prove the following equivalences: 


f(x,y),g(%,y) are properly equivalent 


,_PT+q (P 4 
(7.8) ao Dt? *) €SL@,2) 


<=> [1,7] = [1,7], \ € K*. 


To see why this is true, assume that f(x,y) = g(px + qy, rx + sy), where
$\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ ∈ SL(2,Z). Then


(7.9)   0 = f(τ,1) = g(pτ + q, rτ + s) = (rτ + s)² g((pτ + q)/(rτ + s), 1),


so that g((pτ + q)/(rτ + s), 1) = 0. However, an easy computation shows
that


\[ (7.10) \qquad \operatorname{Im}\left(\frac{p\tau + q}{r\tau + s}\right) = \det\begin{pmatrix} p & q \\ r & s \end{pmatrix} |r\tau + s|^{-2}\, \operatorname{Im}(\tau) \]


(see Exercise 7.11). This implies (pτ + q)/(rτ + s) ∈ 𝔥, and thus τ′ = (pτ +
q)/(rτ + s) by the uniqueness of the root τ′. Conversely, if τ′ = (pτ +
q)/(rτ + s), then (7.9) shows that f(x,y) and g(px + qy, rx + sy) have the
same root, and it follows easily that they must be equal (see Exercise 7.12).
This proves the first equivalence of (7.8).

Next, if τ′ = (pτ + q)/(rτ + s), let λ = rτ + s ∈ K*. Then


λ[1,τ′] = (rτ + s)[1, (pτ + q)/(rτ + s)]
        = [rτ + s, pτ + q] = [1,τ]

since $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ ∈ SL(2,Z). Conversely, if [1,τ] = λ[1,τ′] for some λ ∈ K*, then
[1,τ] = [λ,λτ′], which implies

λτ′ = pτ + q
λ = rτ + s

for some $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ ∈ GL(2,Z). This gives us


τ′ = (pτ + q)/(rτ + s),

and then (7.10) shows that $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ ∈ SL(2,Z) since τ and τ′ are both in 𝔥.
This completes the proof of (7.8).

Using (7.8), one easily sees that the map sending f(x,y) to a[1,τ] induces
an injection

C(D) → C(O).

To show that the map is surjective, let 𝔞 be a fractional O-ideal. As in
the proof of Proposition 7.4, we can write 𝔞 = [α,β] for some α,β ∈ K.
Switching α and β if necessary, we can assume that τ = β/α lies in 𝔥.
Let ax² + bx + c be the minimal polynomial of τ. We may assume that
gcd(a,b,c) = 1 and a > 0. Then f(x,y) = ax² + bxy + cy² is positive
definite of discriminant D (see Exercise 7.12), and f(x,y) maps to a[1,τ].
This ideal lies in the class of 𝔞 = [α,β] = α[1,τ] in C(O), and surjectivity is
proved.

We thus have a bijection of sets

(7.11)   C(D) → C(O).


We next want to see what happens to the group structure, but we first need
to review the formulas for Dirichlet composition from §3. Given two
primitive positive definite forms f(x,y) = ax² + bxy + cy² and g(x,y) = a′x² +
b′xy + c′y² of discriminant D, suppose that gcd(a,a′,(b + b′)/2) = 1. Then
the Dirichlet composition of f(x,y) and g(x,y) was defined to be the form


F(x,y) = aa′x² + Bxy + ((B² − D)/(4aa′)) y²,

where B is the unique number modulo 2aa′ such that

         B ≡ b mod 2a
(7.12)   B ≡ b′ mod 2a′
         B² ≡ D mod 4aa′


(see Lemma 3.2 and (3.7)). In Theorem 3.9 we asserted that Dirichlet
composition made C(D) into an Abelian group, but the proof given in §3 was
not complete. So our first task is to use the bijection (7.11) to finish the
proof of Theorem 3.9.
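
As a computational aside, the congruences (7.12) can be solved by a direct search. The following is only a minimal sketch: the function name `dirichlet_compose` and the representation of a form as a triple `(a, b, c)` are assumptions made for illustration, and the brute-force search over B modulo 2aa′ is chosen for clarity rather than efficiency.

```python
from math import gcd


def dirichlet_compose(f, g):
    """Dirichlet composition of f = (a, b, c) and g = (a2, b2, c2).

    Each triple stands for a primitive positive definite form
    a*x^2 + b*x*y + c*y^2.  Both forms must have the same discriminant
    D < 0 and satisfy gcd(a, a2, (b + b2)/2) = 1, as in the text.
    """
    a, b, c = f
    a2, b2, c2 = g
    D = b * b - 4 * a * c
    if D >= 0 or D != b2 * b2 - 4 * a2 * c2:
        raise ValueError("forms must share a negative discriminant")
    if gcd(gcd(a, a2), (b + b2) // 2) != 1:
        raise ValueError("need gcd(a, a', (b + b')/2) = 1")
    # Search for the unique B modulo 2aa' satisfying the three lines of (7.12).
    for B in range(2 * a * a2):
        if ((B - b) % (2 * a) == 0
                and (B - b2) % (2 * a2) == 0
                and (B * B - D) % (4 * a * a2) == 0):
            return (a * a2, B, (B * B - D) // (4 * a * a2))
    raise ArithmeticError("no B found; check the hypotheses")


# Discriminant -56: compose 2x^2 + 7y^2 with 3x^2 + 2xy + 5y^2.
print(dirichlet_compose((2, 0, 7), (3, 2, 5)))  # (6, 8, 5)
```

The output form 6x² + 8xy + 5y² again has discriminant 64 − 120 = −56. It is not reduced; reducing it is a separate step not shown here.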

Given f(x,y), g(x,y) and F(x,y) as above, we get three proper ideals
of O:


[a,(−b + f√d_K)/2], [a′,(−b′ + f√d_K)/2] and [aa′,(−B + f√d_K)/2].


If we set Δ = (−B + f√d_K)/2 and use the top two lines of (7.12), then
these ideals can be written as


[a,Δ], [a′,Δ] and [aa′,Δ].

We claim that

(7.13)   [a,Δ][a′,Δ] = [aa′,Δ].

To see this, note that Δ² ≡ −BΔ mod aa′ by the last line of (7.12). Thus

[a,Δ][a′,Δ] = [aa′,aΔ,a′Δ,Δ²] = [aa′,aΔ,a′Δ,−BΔ].

However, from (7.12) one easily proves that gcd(a,a′,B) = 1 (see Exercise
7.13), and then (7.13) follows immediately.

By (7.11) and (7.13), we see that the Dirichlet composition of f(x,y) 
and g(x,y) corresponds to the product of their corresponding ideal classes, 
which proves that Dirichlet composition induces a well-defined binary op- 
eration on C(D). Furthermore, since the product of ideals makes C(O) into 
a group, it follows immediately that C(D) is a group under Dirichlet com- 
position. This completes the proof of Theorem 3.9, and it is now obvious 
that (7.11) is an isomorphism of groups. 

Before we can prove part (iii) of the theorem, we need to learn more
about the norm N(𝔞) = |O/𝔞| of a proper O-ideal 𝔞. The basic properties
of N(𝔞) are:


Lemma 7.14. Let O be an order in an imaginary quadratic field. Then:
(i) N(αO) = N(α) for α ∈ O, α ≠ 0.

(ii) N(𝔞𝔟) = N(𝔞)N(𝔟) for proper O-ideals 𝔞 and 𝔟.

(iii) 𝔞𝔞̄ = N(𝔞)O for a proper O-ideal 𝔞.


Proof. The proof of (i) is covered in Exercises 7.14 and 7.15. We will next
prove a special case of (ii): if α ≠ 0 in O, we claim that


(7.15)   N(α𝔞) = N(α)N(𝔞).


To prove this, note that the inclusions α𝔞 ⊂ αO ⊂ O give us the short exact
sequence

0 → αO/α𝔞 → O/α𝔞 → O/αO → 0,


which implies that |O/α𝔞| = |O/αO||αO/α𝔞|. Since multiplication by α
induces an isomorphism O/𝔞 → αO/α𝔞, we get N(α𝔞) = N(αO)N(𝔞), and
then (7.15) follows from (i).

Before proving (ii) and (iii), we need to study N(𝔞). If we write 𝔞 in
the form 𝔞 = α[1,τ], then Lemma 7.5 implies that O = [1,aτ]. Since
[a,aτ] obviously has index a in [1,aτ], we obtain


N(a[1,τ]) = a.

Then a·𝔞 = α·a[1,τ] and (7.15) imply that

(7.16)   N(𝔞) = N(α)/a.


Now (iii) follows immediately by combining (7.16) with the equation


𝔞𝔞̄ = (N(α)/a) O

proved in (7.6). Turning to (ii), note that (iii) implies that

N(𝔞𝔟)O = 𝔞𝔟·𝔞̄𝔟̄ = 𝔞𝔞̄·𝔟𝔟̄ = N(𝔞)O·N(𝔟)O = N(𝔞)N(𝔟)O,

and then N(𝔞𝔟) = N(𝔞)N(𝔟) follows. Q.E.D.


A useful consequence of this lemma is that if 𝔞 is a proper O-ideal,
then 𝔞̄ gives the inverse of 𝔞 in C(O). This follows immediately from 𝔞𝔞̄ =
N(𝔞)O. In Exercise 7.16 we will use the isomorphism C(D) ≃ C(O) to give
a second proof of this fact.

We can now prove part (iii) of the theorem. If m is represented by
f(x,y), then m = d²a, where a is properly represented by f(x,y). We may
assume that f(x,y) = ax² + bxy + cy². Then f(x,y) maps to 𝔞 = a[1,τ], so
that N(𝔞) = a by (7.16). It follows that N(d𝔞) = d²a = m, so that m is the
norm of an ideal in the class of 𝔞.

Conversely, assume that N(𝔞) = m. We know that 𝔞 = α[1,τ], where
Im(τ) > 0 and aτ² + bτ + c = 0, gcd(a,b,c) = 1 and a > 0. Then f(x,y) =
ax² + bxy + cy² maps to the class of 𝔞, so that we need only show that
f(x,y) represents m.

By (7.16), we know that


m = N(𝔞) = N(α)/a.


However, α[1,τ] = 𝔞 ⊂ O = [1,aτ], so that α = p + qaτ and ατ = r + saτ
for some integers p,q,r,s ∈ Z. Thus (p + qaτ)τ = r + saτ, and since aτ² =
−bτ − c, comparing coefficients shows that p = as + bq. Hence


m = N(α)/a = (1/a)(p² − bpq + acq²)
  = (1/a)((as + bq)² − b(as + bq)q + acq²)
  = (1/a)(a²s² + absq + acq²)
  = as² + bsq + cq² = f(s,q).

This proves (iii) and completes the proof of Theorem 7.7. Q.E.D.


Notice that Theorem 5.30 is an immediate corollary of Theorem 7.7. 

The map f(x,y) ↦ 𝔞 = [a,(−b + √D)/2] of Theorem 7.7 has a natural
inverse which is defined as follows. If 𝔞 = [α,β] is a proper O-ideal with
Im(β/α) > 0, then


f(x,y) = N(αx − βy)/N(𝔞)


is a positive definite form of discriminant D. On the level of classes, this
map is the inverse to the map of Theorem 7.7 (see Exercise 7.17).

Theorem 7.7 allows us to translate what we know about quadratic forms 
into facts about ideal classes. Here is an example that will be useful later 
on: 


Corollary 7.17. Let O be an order in an imaginary quadratic field. Given a
nonzero integer M, then every ideal class in C(O) contains a proper O-ideal
whose norm is relatively prime to M.


Proof. In Lemma 2.25 we learned that any primitive form represents
numbers relatively prime to M, and the corollary then follows from part (iii) of
Theorem 7.7. Q.E.D.


The reader may wonder if Theorem 7.7 holds for real quadratic fields.
Simple examples show that this isn't true in general. For instance, when
K = Q(√3), the maximal order O_K = Z[√3] is a UFD, which implies that
C(O_K) ≃ {1}. Yet the forms ±(x² − 3y²) of discriminant d_K = 12 are not
properly equivalent, so that C(d_K) ≠ {1} (see Exercise 7.18 for the details).
In order to make a version of Theorem 7.7 that holds for real quadratic
fields, we need to change the notion of equivalence. In Exercises 7.19-7.24
we will explore two ways of doing this:


1. Change the notion of equivalence of ideals. Instead of using all principal
ideals P(O), use only P⁺(O), which consists of all principal ideals αO
where N(α) > 0. The quotient I(O)/P⁺(O) is the narrow (or strict) ideal
class group and is denoted by C⁺(O). In Exercise 7.21 we will construct
a natural isomorphism C(D) ≃ C⁺(O) which holds for any order in any
quadratic field K. We also have C⁺(O) = C(O) when K is imaginary,
and the same is true when K is real and O has a unit ε with N(ε) = −1.
If O has no such unit, then |C⁺(O)| = 2|C(O)|.

2. Change the notion of equivalence of forms. Instead of using proper
equivalence, use the notion of signed equivalence, where f(x,y) and
g(x,y) are signed equivalent if there is a matrix $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ ∈ GL(2,Z) such
that


\[ f(x,y) = \det\begin{pmatrix} p & q \\ r & s \end{pmatrix} g(px + qy,\, rx + sy). \]


The set of signed equivalence classes is denoted C_s(D), and in Exercise
7.22 we will see that there is a natural isomorphism C_s(D) ≃ C(O). The
criteria for when C_s(D) = C(D) are the same as above.


For other treatments of the relation between forms and ideals, see
Borevich and Shafarevich [8, Chapter 2, §7.5], Cohn [19, §§14.A-C] and Zagier
[111, §§8 and 10].

C. Ideals Prime to the Conductor


The theory described so far does not interact well with the usual
formulation of class field theory. The reason is that class field theory is always
stated in terms of the maximal order O_K. So given an order O in a
quadratic field K, we will need to translate proper O-ideals into terms of
O_K-ideals. This is difficult to do directly, but becomes much easier once we
study O-ideals prime to the conductor.

Given an order O of conductor f, we say that a nonzero O-ideal 𝔞 is
prime to f provided that 𝔞 + fO = O. The following lemma gives the basic
properties of O-ideals prime to the conductor:


Lemma 7.18. Let O be an order of conductor f.

(i) An O-ideal 𝔞 is prime to f if and only if its norm N(𝔞) is relatively prime
to f.

(ii) Every O-ideal prime to f is proper.


Proof. To prove (i), let m_f : O/𝔞 → O/𝔞 be multiplication by f. Then

𝔞 + fO = O ⟺ m_f is surjective ⟺ m_f is an isomorphism.


By the structure theorem for finite Abelian groups, m_f is an isomorphism
if and only if f is relatively prime to the order N(𝔞) of O/𝔞, and (i) is
proved.

To show that an O-ideal 𝔞 prime to f is proper, let β ∈ K satisfy β𝔞 ⊂ 𝔞.
Then β is certainly in O_K, and we thus have


βO = β(𝔞 + fO) = β𝔞 + βfO ⊂ 𝔞 + fO_K.


However, fO_K ⊂ O, which proves that βO ⊂ O. Thus β ∈ O, which proves
that 𝔞 is proper. Q.E.D.


It follows that O-ideals prime to f lie naturally in I(O) and are closed
under multiplication (since N(𝔞𝔟) = N(𝔞)N(𝔟) will also be prime to f).
The subgroup of fractional ideals they generate is denoted I(O,f) ⊂ I(O),
and inside of I(O,f) we have the subgroup P(O,f) generated by the
principal ideals αO where α ∈ O has norm N(α) prime to f. We can then
describe C(O) in terms of I(O,f) and P(O,f) as follows:


Proposition 7.19. The inclusion I(O,f) ⊂ I(O) induces an isomorphism

I(O,f)/P(O,f) ≃ I(O)/P(O) = C(O).


Proof. The map I(O,f) → C(O) is surjective by Corollary 7.17 (any ideal
class in C(O) contains an O-ideal prime to f), and the kernel is I(O,f) ∩
P(O). This obviously contains P(O,f), but the inclusion I(O,f) ∩ P(O) ⊂
P(O,f) needs proof. An element of I(O,f) ∩ P(O) is a fractional ideal
αO = 𝔞𝔟⁻¹, where α ∈ K and 𝔞 and 𝔟 are O-ideals prime to f. Let m =
N(𝔟). Then mO = N(𝔟)O = 𝔟𝔟̄, so that m𝔟⁻¹ = 𝔟̄. Hence


mαO = 𝔞·m𝔟⁻¹ = 𝔞𝔟̄ ⊂ O,


which proves that mαO ∈ P(O,f). Then αO = mαO·(mO)⁻¹ is also in
P(O,f), and the proposition is proved. Q.E.D.


For any order O, ideals prime to the conductor relate nicely to ideals
for the maximal order O_K. Before we can explain this, we need a
definition: given a positive integer m, an O_K-ideal 𝔞 is prime to m provided that
𝔞 + mO_K = O_K. As in Lemma 7.18, this is equivalent to gcd(N(𝔞), m) = 1.
Thus, inside of the group of fractional O_K-ideals I_K, we have the subgroup
I_K(m) ⊂ I_K generated by O_K-ideals prime to m.


Proposition 7.20. Let O be an order of conductor f in an imaginary
quadratic field K.

(i) If 𝔞 is an O_K-ideal prime to f, then 𝔞 ∩ O is an O-ideal prime to f of
the same norm.

(ii) If 𝔞 is an O-ideal prime to f, then 𝔞O_K is an O_K-ideal prime to f of
the same norm.

(iii) The map 𝔞 ↦ 𝔞 ∩ O induces an isomorphism I_K(f) ≃ I(O,f), and the
inverse of this map is given by 𝔞 ↦ 𝔞O_K.


Proof. To prove (i), let 𝔞 be an O_K-ideal prime to f. Since O/(𝔞 ∩ O) injects
into O_K/𝔞 and N(𝔞) is prime to f, so is N(𝔞 ∩ O), which proves that 𝔞 ∩ O
is prime to f. As for norms, consider the natural map


O/(𝔞 ∩ O) → O_K/𝔞.


It is injective, and since 𝔞 is prime to f, multiplication by f induces an
isomorphism of O_K/𝔞. But fO_K ⊂ O, and surjectivity follows. This shows
that the norms are equal, and (i) is proved.

To prove (ii), let 𝔞 be an O-ideal prime to f. Since


𝔞O_K + fO_K = (𝔞 + fO)O_K = O·O_K = O_K,


we see that 𝔞O_K is also prime to f. The statement about norms will be
proved below.
Turning to (iii), we claim that


(7.21)   𝔞O_K ∩ O = 𝔞 when 𝔞 is an O-ideal prime to f
         (𝔞 ∩ O)O_K = 𝔞 when 𝔞 is an O_K-ideal prime to f.

We start with the top line. If 𝔞 is an O-ideal prime to f, then

𝔞O_K ∩ O = (𝔞O_K ∩ O)O
          = (𝔞O_K ∩ O)(𝔞 + fO)
          ⊂ 𝔞 + f(𝔞O_K ∩ O) ⊂ 𝔞 + 𝔞·fO_K.


Since fO_K ⊂ O, this proves that 𝔞O_K ∩ O ⊂ 𝔞. The other inclusion is
obvious, so that equality follows. Turning to the second line of (7.21), let 𝔞 be
an O_K-ideal prime to f. Then


𝔞 = 𝔞O = 𝔞(𝔞 ∩ O + fO) ⊂ (𝔞 ∩ O)O_K + f𝔞.


However, f𝔞 ⊂ fO_K ⊂ O, so that f𝔞 ⊂ 𝔞 ∩ O ⊂ (𝔞 ∩ O)O_K, and 𝔞 ⊂ (𝔞 ∩
O)O_K follows. The other inclusion is obvious, which finishes the proof of
(7.21). Notice that (7.21) and (i) imply the norm statement of (ii).

From (7.21) we get a bijection on the monoids of O_K- and O-ideals
prime to f. If we can show that 𝔞 ↦ 𝔞 ∩ O preserves multiplication, then
we get an isomorphism I_K(f) ≃ I(O,f) (see Exercise 7.25). But
multiplicativity is easy, for the inverse map 𝔞 ↦ 𝔞O_K is obviously multiplicative:


(𝔞𝔟)O_K = 𝔞O_K·𝔟O_K.

This proves the proposition. Q.E.D.


Using this proposition, it follows that every O-ideal prime to f has a
unique decomposition as a product of prime O-ideals which are prime to f
(see Exercise 7.26).

We can now describe C(O) in terms of the maximal order:


Proposition 7.22. Let O be an order of conductor f in an imaginary
quadratic field K. Then there are natural isomorphisms


C(O) ≃ I(O,f)/P(O,f) ≃ I_K(f)/P_{K,Z}(f),


where P_{K,Z}(f) is the subgroup of I_K(f) generated by principal ideals of the
form αO_K, where α ∈ O_K satisfies α ≡ a mod fO_K for some integer a
relatively prime to f.


Remark. To keep track of the various ideal groups, remember that the
subscript K refers to the maximal order O_K (as in I_K, I_K(f), etc.), while no
subscript refers to the order O (as in I(O), I(O,f), etc.).


Proof. The first isomorphism comes from Proposition 7.19. To prove the
second, note that 𝔞 ↦ 𝔞O_K induces an isomorphism I(O,f) ≃ I_K(f) by
Proposition 7.20. Under this isomorphism P(O,f) ⊂ I(O,f) maps to a
subgroup P ⊂ I_K(f). It remains to prove P = P_{K,Z}(f).

We first show that for α ∈ O_K,


(7.23)   α ≡ a mod fO_K, a ∈ Z, gcd(a,f) = 1
         ⟺ α ∈ O, gcd(N(α),f) = 1.


Going one way, assume that α ≡ a mod fO_K, where a ∈ Z is relatively
prime to f. Then N(α) ≡ a² mod f follows easily (see Exercise 7.27), so
that gcd(N(α),f) = gcd(a²,f) = 1. Since fO_K ⊂ O, we also see that α ∈ O.
Conversely, let α ∈ O = [1,f w_K] have norm prime to f. Then α ≡ a mod
fO_K for some a ∈ Z. Since gcd(N(α),f) = 1 and N(α) ≡ a² mod f, we
must have gcd(a,f) = 1, and (7.23) is proved.

We know that P(O,f) is generated by the ideals αO, where α ∈ O and
N(α) is relatively prime to f. Thus P is generated by the corresponding
ideals αO_K, and by (7.23), this implies that P = P_{K,Z}(f). Q.E.D.


In §9 we will use this proposition to link C(O) to the class field theory
of K. For other discussions of the relation between ideals of O and O_K,
see Deuring [24, §8] and Lang [73, §8.1].


D. The Class Number 


One of the nicest applications of Proposition 7.22 is a formula for the class
number h(O) in terms of its conductor f and the class number h(O_K) of
the maximal order. Before we can state the formula, we need to recall some
terminology from §5. Given an odd prime p, we have the Legendre symbol
(d_K/p), and for p = 2 we have the Kronecker symbol:


\[ \left(\frac{d_K}{2}\right) = \begin{cases} 0 & \text{if } 2 \mid d_K \\ 1 & \text{if } d_K \equiv 1 \bmod 8 \\ -1 & \text{if } d_K \equiv 5 \bmod 8. \end{cases} \]


(Recall that d_K ≡ 1 mod 4 when d_K is odd.) We can now state our formula
for h(O):


Theorem 7.24. Let O be the order of conductor f in an imaginary quadratic
field K. Then

\[ h(O) = \frac{h(O_K)\, f}{[O_K^* : O^*]} \prod_{p \mid f} \left(1 - \left(\frac{d_K}{p}\right)\frac{1}{p}\right). \]


Furthermore, h(O) is always an integer multiple of h(O_K).


Proof. By Theorem 7.7 and Proposition 7.22, we have

h(O) = |C(O)| = |I_K(f)/P_{K,Z}(f)|
h(O_K) = |C(O_K)| = |I_K/P_K|.

Since I_K(f) ⊂ I_K and P_{K,Z}(f) ⊂ I_K(f) ∩ P_K, we get an exact sequence


(7.25)
0 → I_K(f) ∩ P_K/P_{K,Z}(f) → I_K(f)/P_{K,Z}(f) → I_K/P_K
                                      ↓≀                ↓≀
                                     C(O)      →      C(O_K).


We know from Corollary 7.17 that every class in C(O_K) contains an
O_K-ideal whose norm is relatively prime to f. This implies that C(O) → C(O_K)
is surjective, which proves that h(O_K) divides h(O). Furthermore, (7.25)
then implies that

(7.26)   h(O)/h(O_K) = |I_K(f) ∩ P_K/P_{K,Z}(f)|.

It remains to compute the order of I_K(f) ∩ P_K/P_{K,Z}(f). The key idea is to
relate this quotient to (O_K/fO_K)*.

Given [α] ∈ (O_K/fO_K)*, the ideal αO_K is prime to f and thus lies in
I_K(f) ∩ P_K. Furthermore, if α ≡ β mod fO_K, we can choose u ∈ O_K with
uα ≡ uβ ≡ 1 mod fO_K. Then the ideals uαO_K and uβO_K lie in P_{K,Z}(f),
and since

αO_K·uβO_K = βO_K·uαO_K,


αO_K and βO_K lie in the same class in I_K(f) ∩ P_K/P_{K,Z}(f). Consequently,
the map

φ : (O_K/fO_K)* → I_K(f) ∩ P_K/P_{K,Z}(f)


sending [α] to [αO_K] is a well-defined homomorphism.

We will first show that φ is surjective. An element of I_K(f) ∩ P_K can be
written as αO_K = 𝔞𝔟⁻¹, where α ∈ K and 𝔞 and 𝔟 are O_K-ideals prime
to f. Letting m = N(𝔟), we've seen that 𝔟̄ = m𝔟⁻¹, so that mαO_K = 𝔞𝔟̄,
which implies that mα ∈ O_K. Note also that mαO_K is prime to f. Since
mO_K ∈ P_{K,Z}(f), it follows that [αO_K] = [mαO_K] = φ([mα]), proving that φ is
surjective.

To determine the kernel of φ, we will assume that O_K* = {±1} (by
Exercise 5.9, this means that K ≠ Q(√−3) or Q(i)). In this case we will show
that there is an exact sequence


(7.27)   1 → (Z/fZ)* —ψ→ (O_K/fO_K)* —φ→ I_K(f) ∩ P_K/P_{K,Z}(f) → 1


where ψ is the obvious injection. The definition of P_{K,Z}(f) makes it clear
that im(ψ) ⊂ ker(φ). Going the other way, let [α] ∈ ker(φ). Then αO_K ∈
P_{K,Z}(f), i.e., αO_K = βO_K·γ⁻¹O_K, where β and γ satisfy β ≡ b mod fO_K
and γ ≡ c mod fO_K for some [b] and [c] in (Z/fZ)*. Since O_K* = {±1}, it
follows that α = ±βγ⁻¹, and one then easily sees that ±[b][c]⁻¹ ∈ (Z/fZ)*
maps to [α] ∈ (O_K/fO_K)*. This proves exactness.

It is well-known that


\[ |(\mathbb{Z}/f\mathbb{Z})^*| = f \prod_{p \mid f} \left(1 - \frac{1}{p}\right), \]

and in Exercises 7.28 and 7.29 we will show that

\[ |(O_K/fO_K)^*| = f^2 \prod_{p \mid f} \left(1 - \frac{1}{p}\right)\left(1 - \left(\frac{d_K}{p}\right)\frac{1}{p}\right). \]

Using these formulas and (7.26), we obtain


\[ \frac{h(O)}{h(O_K)} = \frac{|(O_K/fO_K)^*|}{|(\mathbb{Z}/f\mathbb{Z})^*|} = f \prod_{p \mid f} \left(1 - \left(\frac{d_K}{p}\right)\frac{1}{p}\right), \]


which proves the desired formula since |O_K*| = |O*| = 2. In Exercise 7.30
we will indicate how to modify this argument when O_K* ≠ {±1}. Q.E.D.


This theorem may also be proved by analytic methods—see, for example, 
Zagier [111, §8, Exercise 8]. 
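
To make the formula of Theorem 7.24 concrete, here is a small sketch that evaluates it with exact rational arithmetic. The function names are hypothetical, and the unit index [O_K* : O*] is supplied under the assumption (standard, but not proved in this section) that O* = {±1} whenever f > 1, while |O_K*| is 6, 4 or 2 according to whether d_K is −3, −4 or anything else.

```python
from fractions import Fraction


def symbol_at_prime(d, p):
    """The symbol (d/p) for a prime p and a field discriminant d."""
    if p == 2:
        if d % 2 == 0:
            return 0
        return 1 if d % 8 == 1 else -1
    r = pow(d % p, (p - 1) // 2, p)  # Euler's criterion
    return -1 if r == p - 1 else r


def prime_divisors(n):
    """Distinct prime divisors of n >= 1, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def unit_index(d_K, f):
    """[O_K^* : O^*], assuming O^* = {+1, -1} for every conductor f > 1."""
    if f == 1:
        return 1
    return {-3: 3, -4: 2}.get(d_K, 1)


def class_number_of_order(h_K, d_K, f):
    """h(O) for the order of conductor f, given h_K = h(O_K) and d_K."""
    h = Fraction(h_K * f, unit_index(d_K, f))
    for p in prime_divisors(f):
        h *= 1 - Fraction(symbol_at_prime(d_K, p), p)
    if h.denominator != 1:
        raise ArithmeticError("inputs are inconsistent: h(O) is not an integer")
    return int(h)


# Z[sqrt(-3)] has d_K = -3 and f = 2, so D = -12.
print(class_number_of_order(1, -3, 2))  # 1
# The order of conductor 3 in Q(i) has D = -36.
print(class_number_of_order(1, -4, 3))  # 2
```

The value h(O_K) must be supplied by the caller; the sketch only converts it into h(O).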

Using Theorem 7.24, we can relate the class numbers h(m²D) and h(D)
as follows:


Corollary 7.28. Let D ≡ 0,1 mod 4 be negative, and let m be a positive
integer. Then

\[ h(m^2 D) = \frac{h(D)\, m}{[O^* : O'^*]} \prod_{p \mid m} \left(1 - \left(\frac{D}{p}\right)\frac{1}{p}\right), \]


where O and O′ are the orders of discriminant D and m²D, respectively (and
O′ has index m in O).


Proof. Suppose that the order O has discriminant D and conductor f. Then
the order O′ ⊂ O of index m has discriminant m²D and conductor mf, and
the corollary follows from Theorem 7.24 (see Exercise 7.31). This corollary
is due to Gauss, and his proof may be found in Disquisitiones [41, §§254-
256]. Q.E.D.


The only method we learned in §2 for computing class numbers h(D)
for D < 0 was to count reduced forms. This becomes awkward as |D| gets
large, but other methods are available. By Theorem 7.24, we are reduced
to computing h(d_K), and here one has the classic formula


\[ (7.29) \qquad h(d_K) = \frac{|O_K^*|}{2 d_K} \sum_{n=1}^{|d_K| - 1} \left(\frac{d_K}{n}\right) n, \]


where (d_K/n) is defined for n = p_1 ··· p_r, p_i prime, by (d_K/n) =
∏_{i=1}^{r} (d_K/p_i). This formula is usually proved by analytic methods (see
Borevich and Shafarevich [8, Chapter 5, Section 4], or Zagier [111, §9]), but
there is also a purely algebraic proof (see Orde [83]).
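
The sum in (7.29) is easy to evaluate by machine. The sketch below assumes the normalization h(d_K) = (|O_K*|/(2d_K)) Σ (d_K/n) n, with |O_K*| equal to 6, 4 or 2 for d_K = −3, −4 or any other field discriminant; the function names are illustrative only, and the symbol (d_K/n) is computed by multiplying the symbols at the prime factors of n, as described above.

```python
from fractions import Fraction


def symbol_at_prime(d, p):
    """The symbol (d/p) for a prime p and a field discriminant d."""
    if p == 2:
        if d % 2 == 0:
            return 0
        return 1 if d % 8 == 1 else -1
    r = pow(d % p, (p - 1) // 2, p)  # Euler's criterion
    return -1 if r == p - 1 else r


def kronecker(d, n):
    """(d/n) for n >= 1, as the product of (d/p) over the prime factors of n."""
    result, p = 1, 2
    while n > 1:
        if n % p == 0:          # p is prime here: smaller factors are gone
            result *= symbol_at_prime(d, p)
            n //= p
        else:
            p += 1
    return result


def class_number_from_sum(d_K):
    """Evaluate the right-hand side of (7.29) for a field discriminant d_K < 0."""
    units = {-3: 6, -4: 4}.get(d_K, 2)
    total = sum(kronecker(d_K, n) * n for n in range(1, abs(d_K)))
    h = Fraction(units * total, 2 * d_K)
    if h.denominator != 1:
        raise ArithmeticError("d_K is probably not a field discriminant")
    return int(h)


print([class_number_from_sum(d) for d in (-3, -4, -7, -8, -15, -23)])
# [1, 1, 1, 1, 2, 3]
```

For d_K = −23, for instance, the sum equals −69, and (2/(−46))·(−69) = 3.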

While (7.29) enables us to compute h(d_K) for a given imaginary
quadratic field, it doesn't reveal the way h(d_K) grows as |d_K| gets large. Gauss
noticed this empirically in Disquisitiones [41, §302], but there were no
complete proofs until the 1930s. The best result is due to Siegel [92], who
proved in 1935 that

\[ \lim_{d_K \to -\infty} \frac{\log h(d_K)}{\log |d_K|} = \frac{1}{2}. \]


This implies that given any ε > 0, there is a constant C(ε) such that

h(d_K) > C(ε)|d_K|^{(1/2) − ε}


for all field discriminants d_K < 0. Unfortunately, the constant C(ε) in
Siegel's proof is not effectively computable given what we currently know
about L-series (these difficulties are related to the Riemann Hypothesis).
However, recent work by Goldfeld, Gross, Zagier and Oesterlé has led to
the weaker formula


\[ h(d_K) > \frac{1}{7000} \prod_{p \mid d_K} \left(1 - \frac{[2\sqrt{p}]}{p + 1}\right) \log |d_K|, \]


where [ ] is the greatest integer function. For a fuller discussion of this
result and its implications, see Oesterlé [81] or Zagier [112].

These results on the growth of h(d_K) imply that there are only finitely
many orders with given class number h (see Exercise 7.32). Nevertheless,
even when h is small, determining exactly which orders have class number
h remains a difficult problem. For the case of class number 1, the answer
is given by the following theorem due independently to Baker [3], Heegner
[52] and Stark [96]:


Theorem 7.30.

(i) If K is an imaginary quadratic field of discriminant d_K, then

h(d_K) = 1 ⟺ d_K = −3, −4, −7, −8, −11, −19, −43, −67, −163.

(ii) If D ≡ 0,1 mod 4 is negative, then

h(D) = 1 ⟺ D = −3, −4, −7, −8, −11, −12, −16,
                −19, −27, −28, −43, −67, −163.

Proof. First note that (i) ⇒ (ii). To see this, assume h(D) = 1. If we write
D = f²d_K, then Theorem 7.24 tells us that h(d_K) | h(D), and thus h(d_K) =
1. By (i), this determines the possibilities for d_K, but we still need to see
which conductors f > 1 can occur. First, suppose that O_K* = {±1}. If f > 2,
then

\[ f \prod_{p \mid f} \left(1 - \left(\frac{d_K}{p}\right)\frac{1}{p}\right) > 1, \]

so that by Theorem 7.24, this case can be excluded. One then calculates
directly (using (i) and Theorem 7.24) that f = 2 happens only when d_K =
−7, i.e., D = −28. The argument when O_K* ≠ {±1} is similar and is left to
the reader (see Exercise 7.33).

The proof of (i) is a different matter. When the discriminant is even, the
theorem was proved in §2 by an elementary argument due to Landau (see
Theorem 2.18). But when the discriminant is odd, the proof is much more
difficult. In §12 we will use modular functions and complex multiplication
to give a complete proof of (i). Q.E.D.
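
Although the theorem itself needs the deep methods just mentioned, part (ii) can at least be checked numerically in a finite range by counting reduced forms as in §2. The following sketch (the function name `class_number` and the search bound 200 are arbitrary choices made for illustration) uses the reduction conditions |b| ≤ a ≤ c, with b ≥ 0 whenever |b| = a or a = c, and the resulting bound 3a² ≤ |D|.

```python
from math import gcd


def class_number(D):
    """Count reduced primitive positive definite forms of discriminant D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("need D < 0 with D = 0 or 1 mod 4")
    count, a = 0, 1
    while 3 * a * a <= -D:               # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):   # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:         # when a = c we require b >= 0
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                count += 1
        a += 1
    return count


# Discriminants with class number 1 in the range -200 < D < 0.
print([D for D in range(-1, -200, -1)
       if D % 4 in (0, 1) and class_number(D) == 1])
# [-3, -4, -7, -8, -11, -12, -16, -19, -27, -28, -43, -67, -163]
```

This only confirms the list for |D| < 200; that no further discriminants occur is exactly the content of Theorem 7.30.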


E. Exercises 


7.1. Let K be a finite extension of Q of degree n, and let M ⊂ K be a
finitely generated Z-module.
(a) Prove that M is a free Z-module.
(b) Prove that M has rank n if and only if M contains a Q-basis
of K.


7.2. Let O be an order in a quadratic field K. Prove that O ⊂ O_K.


7.3. This exercise is concerned with the conductor and discriminant of an
order O in a quadratic field K. Let α ↦ α′ be the nontrivial
automorphism of K.

(a) If O = [α,β], then the discriminant is defined to be


\[ D = \left(\det\begin{pmatrix} \alpha & \beta \\ \alpha' & \beta' \end{pmatrix}\right)^2. \]


Prove that D is independent of the basis used and hence
depends only on O.

(b) Use the basis O = [1,f w_K] from Lemma 7.2 to prove that D =
f²d_K.

(c) Use (b) and Lemma 7.2 to prove that an order in a quadratic field
is uniquely determined by its discriminant.


(d) If D ≡ 0,1 mod 4 is nonsquare, then show that there is an order
in a quadratic field whose discriminant is D.


7.4. Let O be an order in a quadratic field K.

(a) If 𝔞 is a nonzero ideal of O, prove that 𝔞 contains a nonzero
integer m. Hint: take α ∈ 𝔞, and use Lemma 7.2 to show that
α′ ∈ O, where α ↦ α′ is the nontrivial automorphism of K.

(b) If 𝔞 is a nonzero ideal of O, show that O/𝔞 is finite. Hint: take
the integer m from (a) and show that O/mO is finite.

(c) Use (b) to show that every nonzero prime ideal of O is maximal.

(d) Use (b) to show that O is Noetherian.


7.5. Let K = Q(√−3), and let 𝔞 be the ideal of O = Z[√−3] generated
by 2 and 1 + √−3. Show that


{β ∈ K : β𝔞 ⊂ 𝔞} = O_K ≠ O.


7.6. Let K be a quadratic field.
(a) Show that for any order of K, principal ideals are always proper.
(b) Show that for the maximal order O_K, all ideals are proper.


7.7. Let O be an order of K, and let 𝔟 ⊂ K be an O-module (note that 𝔟
need not be contained in O). Show that 𝔟 is finitely generated as an
O-module if and only if 𝔟 is of the form α𝔞, where α ∈ K and 𝔞 is an
O-ideal.


7.8. Show that a nonzero fractional O-ideal 𝔞 is a free Z-module of rank
2. Hint: use the previous exercise and part (b) of Exercise 7.4.


7.9. Let O = Z[√−3], which is an order of conductor 2 in the imaginary
quadratic field K = Q(√−3).

(a) Show that C(O) ≃ {1}, so that the proper ideals of O are exactly
the principal ideals. Hint: use Theorem 7.7 and what we know
from §2.

(b) Show that if unique factorization holds for proper ideals of O,
then O is a UFD.

(c) Show that 2, 1 + √−3 and 1 − √−3 are irreducible (in the sense
of §4) in O. Since 4 = 2·2 = (1 + √−3)(1 − √−3), this shows
that O is not a UFD.

This example shows that unique factorization can fail for proper
ideals.


7.10. If 𝔞 and 𝔟 are invertible fractional ideals for an order O, then
prove that 𝔞𝔟 and 𝔞⁻¹ (where 𝔞⁻¹ is the fractional O-ideal such that
𝔞𝔞⁻¹ = O) are also invertible fractional O-ideals.


7.11. Prove (7.10).


7.12. Let f(x,y) = ax² + bxy + cy² be a quadratic form with integer
coefficients, and let τ be a root of ax² + bx + c = 0.

(a) Prove that f(x,y) is positive definite if and only if a > 0 and
τ ∉ R.

(b) When f(x,y) is positive definite and gcd(a,b,c) = 1, prove that
the discriminant of f(x,y) is D, where D is the discriminant of
the order O = [1,aτ].

(c) Prove that two primitive positive definite forms which have the
same root τ must be equal.


7.13. Let ax² + bxy + cy² and a′x² + b′xy + c′y² be two primitive
positive definite forms of the same discriminant. Assume that gcd(a,a′,
(b + b′)/2) = 1, and let B be the unique integer modulo 2aa′ which
satisfies the three conditions of (7.12). Prove that gcd(a,a′,B) = 1.


7.14. Let O = [1,u] be an order in a quadratic field, and pick α = a + bu ∈
O, α ≠ 0. Since O is a ring, αu can be written αu = c + du.

(a) Show that N(α) = ad − bc ≠ 0.

(b) Since αO = [α,αu] = [a + bu, c + du] ⊂ O = [1,u] and ad −
bc ≠ 0, it is a standard fact (proved in Exercise 7.15) that
|O/αO| = |ad − bc|. Thus (a) proves the general relation that
N(αO) = |N(α)|.


7.15. Let M = Z², and let A = $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be an integer matrix with det(A) =
ad − bc ≠ 0. Writing M = [e₁,e₂], note that AM = [ae₁ + be₂, ce₁ +
de₂]. Our goal is to prove that |M/AM| = |det(A)|. Let A′ =
$\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$, and note that AA′ = A′A = det(A)I.


(a) Show that det(A)M ⊂ AM and that AM/det(A)M ≃ M/A′M.
(b) Use (a) and the exact sequence


0 → AM/det(A)M → M/det(A)M → M/AM → 0


to show that |M/AM||M/A′M| = (det(A))².

(c) Let © = ({~)). Using ©. 40-1 = 4’, show that ©: M > M in- 
duces an isomorphism M/AM > M/A'M. 

(d) Conclude that |M/AM| = |det(A)|.


7.16. Let O be the order of discriminant D in an imaginary quadratic field
K, and let 𝔞 be a proper O-ideal. In this exercise we will give two
proofs that the class of 𝔞̄ is the inverse of the class of 𝔞 in C(O).

(a) Prove this assertion using part (iii) of Lemma 7.14.

(b) In §3, we proved that the class of the opposite f′(x,y) = ax² −
bxy + cy² is the inverse of the class of f(x,y) = ax² + bxy +
cy². Using the isomorphism C(D) ≃ C(O) from Theorem 7.7,
show that the class of 𝔞̄ is the inverse of the class of 𝔞 in C(O).


7.17. Let O be the order of discriminant D in the imaginary quadratic
field K.

(a) Show that the map sending the proper O-ideal 𝔞 = [α,β] to the
quadratic form


f(x,y) = N(αx − βy)/N(𝔞)


induces a well-defined map C(O) → C(D) which is the inverse
of the map ax² + bxy + cy² ↦ [a,(−b + √D)/2] of Theorem 7.7.
Hint: use (7.16) and Exercise 7.12.

(b) Give examples to show that the map ax² + bxy + cy² ↦
[a,(−b + √D)/2] of Theorem 7.7 is neither injective nor
surjective on the level of forms and ideals.


7.18. Let K = Q(√3), a field of discriminant d_K = 12. By (5.13), we know
that O_K = Z[√3].

(a) Use the absolute value of the norm function to show that O_K is
Euclidean, and conclude that C(O_K) ≃ {1}.

(b) Show that the form class group C(d_K) = C(12) is nontrivial.
Hint: show that the forms ±(x² − 3y²) are not properly
equivalent. You will need to show that the equation a² − 3c² = −1
has no solutions.


This shows that C(d_K) ≄ C(O_K) for K = Q(√3).


7.19. In Exercises 7.19-7.24 we will explore two versions of Theorem 7.7
that hold for real quadratic fields K. To begin, we will study the
orientation of a basis α, β of a proper ideal 𝔞 = [α,β] of an order
O in K. Let α ↦ α′ denote the nontrivial automorphism of K.

(a) Prove that α′β − αβ′ ∈ R*. We then define sgn(α,β) to be the
sign of the nonzero real number α′β − αβ′.


(b) Let $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ ∈ GL(2,Z), and set α̃ = pα + qβ, β̃ = rα + sβ. Note
that 𝔞 = [α,β] = [α̃,β̃]. Prove that


\[ \operatorname{sgn}(\tilde\alpha, \tilde\beta) = \det\begin{pmatrix} p & q \\ r & s \end{pmatrix} \cdot \operatorname{sgn}(\alpha, \beta). \]


We say that α, β are positively oriented if sgn(α,β) > 0 and
negatively oriented otherwise. By (b), two bases of 𝔞 have the same
orientation if and only if their transition matrix is in SL(2,Z).


7.20. Theorem 7.7 was proved using a map from quadratic forms to ideals.
In the real quadratic case, such a map is harder to describe (see
Exercise 7.24), but it is relatively easy to go from ideals to forms.
The goal of this exercise is to show how this is done. Let O be an
order in a real quadratic field K, and let 𝔞 = [α,β] be a proper
O-ideal. Then define the quadratic form f(x,y) by the formula


f(x,y) = N(αx − βy)/N(𝔞).

At this point, all we know is that f(x,y) has rational coefficients.
Let τ = β/α, and let ax² + bx + c be the minimal polynomial of τ.
We can assume that a,b,c ∈ Z, a > 0 and gcd(a,b,c) = 1.

(a) Prove that N(𝔞) = |N(α)|/a. Hint: adapt the proof of (7.16) to
the real quadratic case. Exercise 7.14 will be useful.

(b) Use (a) to prove that f(x,y) = sgn(N(α))(ax² + bxy + cy²).
Thus f(x,y) has relatively prime integer coefficients.

(c) Prove that the discriminant of f(x,y) is D, where D is the
discriminant of O. Hint: see Exercise 7.12.


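7.21. In this exercise we will construct a bijection
$C^+(\mathcal{O}) \simeq C(D)$, where $C^+(\mathcal{O})$ is defined in the text.

(a) Let $\mathfrak{a}$ be a proper $\mathcal{O}$-ideal, and write
$\mathfrak{a} = [\alpha, \beta]$ where $\mathrm{sgn}(\alpha, \beta) > 0$ (see
Exercise 7.19). Then let $f(x,y)$ be the corresponding quadratic form
defined in Exercise 7.20. If $\tilde{\alpha}, \tilde{\beta}$ is another
positively oriented basis of $\mathfrak{a}$, then show that the corresponding
form $g(x,y)$ from Exercise 7.20 is properly equivalent to $f(x,y)$.
Furthermore, show that all forms properly equivalent to $f(x,y)$
arise in this way.

(b) If $\lambda \in \mathcal{O}$ and $N(\lambda) > 0$, then show that
$\lambda\mathfrak{a}$ gives the same class of forms as $\mathfrak{a}$. Hint: show
that $\mathrm{sgn}(\lambda\alpha, \lambda\beta) = \mathrm{sgn}(N(\lambda))\,\mathrm{sgn}(\alpha, \beta)$.

(c) From (a), (b) and Exercise 7.20 we get a well-defined map
$C^+(\mathcal{O}) \to C(D)$. To show that the map is injective, suppose that
$\mathfrak{a}$ and $\tilde{\mathfrak{a}}$ give the same class in $C(D)$. By (a),
we can choose positively oriented bases $\mathfrak{a} = [\alpha, \beta]$ and
$\tilde{\mathfrak{a}} = [\tilde{\alpha}, \tilde{\beta}]$ which give the same
form $f(x,y)$.

(i) Using Exercise 7.19, show that
$\mathrm{sgn}(N(\alpha)) = \mathrm{sgn}(N(\tilde{\alpha}))$, i.e.,
$N(\alpha\tilde{\alpha}) > 0$. Then replacing $\mathfrak{a}$ and
$\tilde{\mathfrak{a}}$ by $\tilde{\alpha}\mathfrak{a}$ and
$\alpha\tilde{\mathfrak{a}}$ respectively allows us to assume that
$\alpha = \tilde{\alpha}$, i.e., $\mathfrak{a} = [\alpha, \beta]$ and
$\tilde{\mathfrak{a}} = [\alpha, \tilde{\beta}]$.

(ii) Let $\tau = \beta/\alpha$ and $\tilde{\tau} = \tilde{\beta}/\alpha$. Show
that $f(\tau, 1) = f(\tilde{\tau}, 1) = 0$, so that $\tilde{\tau} = \tau$ or
$\tau'$. Then show that $\tilde{\tau} = \tau'$ contradicts
$\mathrm{sgn}(\alpha, \tilde{\beta}) > 0$, which proves that
$\beta = \tilde{\beta}$.

(d) To prove surjectivity, let $f(x,y) = ax^2 + bxy + cy^2$ be a form
of discriminant $D$, and let $\tau$ be either of the roots of
$ax^2 + bx + c = 0$. First show that $a\tau \in \mathcal{O}$. Then define an
$\mathcal{O}$-ideal $\mathfrak{a}$ as follows: if $a > 0$, then

$$\mathfrak{a} = [a, a\tau] \quad\text{where } f(\tau, 1) = 0,\ \mathrm{sgn}(1, \tau) > 0,$$

and if $a < 0$, then

$$\mathfrak{a} = \sqrt{d_K}\,[a, a\tau] \quad\text{where } f(\tau, 1) = 0,\ \mathrm{sgn}(1, \tau) < 0.$$

Show that $\mathfrak{a}$ is a proper $\mathcal{O}$-ideal and that the form
corresponding to $\mathfrak{a}$ from Exercise 7.20 is exactly $f(x,y)$.

This completes the proof that $C^+(\mathcal{O}) \to C(D)$ is a bijection.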


7.22. In this exercise we will construct a bijection
$C(\mathcal{O}) \simeq C_s(D)$, where $C_s(D)$ is defined in the text. Our
treatment of $C_s(D)$ is based on Zagier [111, §8].

(a) Let $\mathfrak{a} = [\alpha, \beta]$ be a proper $\mathcal{O}$-ideal, where
this time we make no assumptions about $\mathrm{sgn}(\alpha, \beta)$. Define
$f(x,y)$ to be the quadratic form

$$f(x,y) = \mathrm{sgn}(\alpha, \beta)\,\frac{N(\alpha x - \beta y)}{N(\mathfrak{a})},$$

which by Exercise 7.20 has relatively prime integer coefficients
and discriminant $D$. Show that as we vary over all bases of
$\mathfrak{a}$, the corresponding forms vary over all forms signed
equivalent to $f(x,y)$.

(b) Show that the map $\mathfrak{a} \mapsto f(x,y)$ of (a) induces a
well-defined bijection $C(\mathcal{O}) \simeq C_s(D)$. Hint: adapt the
arguments of parts (b)-(d) of Exercise 7.21.


7.23. This exercise will explore the relations between $C(\mathcal{O})$,
$C^+(\mathcal{O})$, $C(D)$ and $C_s(D)$.

(a) Let $K$ be an imaginary quadratic field.

(i) Show that $P^+(\mathcal{O}) = P(\mathcal{O})$, so that $C^+(\mathcal{O})$
always equals $C(\mathcal{O})$.

(ii) The relation between $C(D)$ and $C_s(D)$ is more interesting.
Namely, in $C(D)$, we had to explicitly assume that we
were only dealing with positive definite forms. However, in
$C_s(D)$, one uses both positive definite and negative definite
forms. Show that any negative definite form is signed equivalent
to a positive definite one, and conclude that $C(D) \simeq C_s(D)$.

(b) Now assume that $K$ is a real quadratic field.

(i) Show that there are natural surjections

$$C^+(\mathcal{O}) \longrightarrow C(\mathcal{O}), \qquad C(D) \longrightarrow C_s(D)$$

which fit together with the bijections of Exercises 7.21 and
7.22 to give a commutative diagram

$$\begin{array}{ccc} C^+(\mathcal{O}) & \longrightarrow & C(D) \\ \downarrow & & \downarrow \\ C(\mathcal{O}) & \longrightarrow & C_s(D) \end{array}$$

(ii) Show that the kernel of $C^+(\mathcal{O}) \to C(\mathcal{O})$ is
$P(\mathcal{O})/P^+(\mathcal{O})$ and that
$P(\mathcal{O}) = P^+(\mathcal{O}) \cup \sqrt{d_K}\,P^+(\mathcal{O})$. Then
conclude that

$$\frac{|C^+(\mathcal{O})|}{|C(\mathcal{O})|} = \begin{cases} 1 & \text{if } \mathcal{O} \text{ has a unit of norm } -1 \\ 2 & \text{otherwise.} \end{cases}$$

(iii) From (i) and (ii), conclude that

$$\frac{|C(D)|}{|C_s(D)|} = \begin{cases} 1 & \text{if } \mathcal{O} \text{ has a unit of norm } -1 \\ 2 & \text{otherwise.} \end{cases}$$


7.24. Write down inverses to the bijections $C^+(\mathcal{O}) \simeq C(D)$ and
$C(\mathcal{O}) \simeq C_s(D)$ of Exercises 7.21 and 7.22. Hint: see part (d)
of Exercise 7.21. Note that the answer is more complicated than the map
$ax^2 + bxy + cy^2 \mapsto [a, (-b + \sqrt{D})/2]$ of Theorem 7.7.

7.25. Let $\varphi$: {$\mathcal{O}_K$-ideals prime to $f$} $\to$
{$\mathcal{O}$-ideals prime to $f$} be a bijection which preserves
multiplication. Show that $\varphi$ extends to an isomorphism
$\bar{\varphi}: I_K(f) \simeq I(\mathcal{O}, f)$.

7.26. Let $\mathcal{O}$ be an order of conductor $f$.

(a) Let $\mathfrak{a}$ be an ideal of $\mathcal{O}$ which is relatively prime to
$f$. Prove that $\mathfrak{a}$ is a prime $\mathcal{O}$-ideal if and only if
$\mathfrak{a}\mathcal{O}_K$ is a prime $\mathcal{O}_K$-ideal. Hint: use
Proposition 7.20 to show that
$\mathcal{O}/\mathfrak{a} \simeq \mathcal{O}_K/\mathfrak{a}\mathcal{O}_K$.

(b) Use (a) and the unique factorization of ideals in $\mathcal{O}_K$ to show
that $\mathcal{O}$-ideals relatively prime to the conductor can be factored
uniquely into prime $\mathcal{O}$-ideals (which are also relatively prime
to $f$).

7.27. If $\alpha, \beta \in \mathcal{O}_K$ and
$\alpha \equiv \beta \bmod m\mathcal{O}_K$ for some integer $m$, then prove
that $N(\alpha) \equiv N(\beta) \bmod m$.

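7.28. Let $K$ be a quadratic field, and let $\mathfrak{p}$ be prime in
$\mathcal{O}_K$. The goal of this exercise is to prove that

$$|(\mathcal{O}_K/\mathfrak{p}^n)^*| = N(\mathfrak{p})^{n-1}(N(\mathfrak{p}) - 1).$$

The formula is true if $n = 1$, and the general case follows easily by
induction once we prove that there is an exact sequence

$$1 \longrightarrow \mathcal{O}_K/\mathfrak{p} \xrightarrow{\ \phi\ } (\mathcal{O}_K/\mathfrak{p}^n)^* \longrightarrow (\mathcal{O}_K/\mathfrak{p}^{n-1})^* \longrightarrow 1$$

for $n \geq 2$. For the rest of the exercise fix an integer $n \geq 2$.

(a) Show that $(\mathcal{O}_K/\mathfrak{p}^n)^* \to (\mathcal{O}_K/\mathfrak{p}^{n-1})^*$
is onto. Hint: take $[\alpha] \in (\mathcal{O}_K/\mathfrak{p}^{n-1})^*$, which
means that $\alpha\beta = 1 + \gamma$, where $\beta \in \mathcal{O}_K$ and
$\gamma \in \mathfrak{p}^{n-1}$. Then show that
$\alpha(\beta + \gamma\delta) - 1 \in \mathfrak{p}^n$ for an appropriately
chosen $\delta$.

(b) By unique factorization, we know that $\mathfrak{p}^n$ is a proper subset
of $\mathfrak{p}^{n-1}$. Pick $u \in \mathfrak{p}^{n-1}$ such that
$u \notin \mathfrak{p}^n$.

(i) Given $\alpha \in \mathcal{O}_K$, show that
$[1 + \alpha u] \in (\mathcal{O}_K/\mathfrak{p}^n)^*$.

(ii) From (i), it is easy to define a map
$\phi: \mathcal{O}_K/\mathfrak{p} \to (\mathcal{O}_K/\mathfrak{p}^n)^*$.
With this definition of $\phi$, show that the above sequence is
exact.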
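
As a numerical sanity check on the unit-count formula of this exercise (not a proof), the following sketch counts units by brute force in one special case. The choice of field $K = \mathbb{Q}(i)$, the inert prime $\mathfrak{p} = (3)$ with $N(\mathfrak{p}) = 9$, and the function name `unit_count_gaussian_mod` are assumptions made purely for illustration.

```python
from itertools import product

def unit_count_gaussian_mod(q):
    """Count the units of Z[i]/(q) by brute force, for a small integer q > 1."""
    residues = list(product(range(q), repeat=2))  # (a, b) stands for a + b*i
    count = 0
    for a, b in residues:
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i; look for an inverse mod q.
        if any((a * c - b * d) % q == 1 and (a * d + b * c) % q == 0
               for c, d in residues):
            count += 1
    return count

# 3 is inert in Z[i], so p = (3) is prime with N(p) = 9 and p^n = (3^n).
for n in (1, 2):
    assert unit_count_gaussian_mod(3 ** n) == 9 ** (n - 1) * (9 - 1)
```

The search is quadratic in the size of the quotient ring, so it is only practical for very small moduli.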


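7.29. Let $K$ be an imaginary quadratic field.

(a) Let $\mathfrak{a} = \prod_{i=1}^{r} \mathfrak{p}_i^{n_i}$ be the
factorization of $\mathfrak{a}$ into primes. Show that there is a natural
isomorphism

$$\mathcal{O}_K/\mathfrak{a} \simeq \prod_{i=1}^{r} (\mathcal{O}_K/\mathfrak{p}_i^{n_i}).$$

This is the Chinese Remainder Theorem for $\mathcal{O}_K$. Hint: it is easy
to construct a map and show it is injective. Then use part (ii) of
Lemma 7.14.

(b) Use (a) and the previous exercise to show that if $\mathfrak{a}$ is a
nonzero ideal of $\mathcal{O}_K$, then

$$|(\mathcal{O}_K/\mathfrak{a})^*| = N(\mathfrak{a}) \prod_{\mathfrak{p} \mid \mathfrak{a}} \left(1 - \frac{1}{N(\mathfrak{p})}\right).$$

Notice the similarity to the usual formula for
$\phi(n) = |(\mathbb{Z}/n\mathbb{Z})^*|$.

(c) If $m$ is a positive integer, conclude that

$$|(\mathcal{O}_K/m\mathcal{O}_K)^*| = m^2 \prod_{p \mid m} \left(1 - \frac{1}{p}\right)\left(1 - \left(\frac{d_K}{p}\right)\frac{1}{p}\right),$$

where $(d_K/p)$ is the Kronecker symbol when $p = 2$.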
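
The formula in part (c) can be spot-checked numerically. The sketch below compares it with a brute-force count for the illustrative case $K = \mathbb{Q}(i)$, where $d_K = -4$ and the Kronecker symbol $(-4/p)$ is $0$ for $p = 2$, $1$ for $p \equiv 1 \bmod 4$ and $-1$ for $p \equiv 3 \bmod 4$. The field and all function names are assumptions chosen for the example, not part of the exercise.

```python
from fractions import Fraction
from itertools import product

def unit_count(m):
    """Brute-force |(Z[i]/mZ[i])^*| for a small integer m > 1."""
    residues = list(product(range(m), repeat=2))  # (a, b) stands for a + b*i
    return sum(
        any((a * c - b * d) % m == 1 and (a * d + b * c) % m == 0
            for c, d in residues)
        for a, b in residues
    )

def kronecker_minus4(p):
    """Kronecker symbol (-4/p) for a prime p."""
    if p == 2:
        return 0
    return 1 if p % 4 == 1 else -1

def prime_divisors(m):
    return [p for p in range(2, m + 1)
            if m % p == 0 and all(p % q for q in range(2, p))]

def formula(m):
    """Right-hand side of part (c) for d_K = -4, as an exact fraction."""
    value = Fraction(m * m)
    for p in prime_divisors(m):
        value *= (1 - Fraction(1, p)) * (1 - Fraction(kronecker_minus4(p), p))
    return value

for m in range(2, 11):
    assert unit_count(m) == formula(m)
```

Exact rational arithmetic via `Fraction` avoids any rounding question when comparing the product with the integer count.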


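7.30. Let $K$ be any quadratic field, and let $f$ be a positive integer.

(a) Use the obvious maps

$$\{\pm 1\} \longrightarrow (\mathbb{Z}/f\mathbb{Z})^* \times \mathcal{O}_K^*$$
$$(\mathbb{Z}/f\mathbb{Z})^* \times \mathcal{O}_K^* \longrightarrow (\mathcal{O}_K/f\mathcal{O}_K)^*$$

and the maps from (7.27) to prove that there is an exact sequence

$$1 \longrightarrow \{\pm 1\} \longrightarrow (\mathbb{Z}/f\mathbb{Z})^* \times \mathcal{O}_K^* \longrightarrow (\mathcal{O}_K/f\mathcal{O}_K)^* \longrightarrow I_K(f) \cap P_K/P_{K,\mathbb{Z}}(f) \longrightarrow 1.$$

Notice that when $\mathcal{O}_K^* = \{\pm 1\}$, this sequence is equivalent to
(7.27).

(b) Use the exact sequence of (a) to prove Theorem 7.24 for all
imaginary quadratic fields.

7.31. Prove Corollary 7.28.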


7.32. In this exercise we will use the inequality

$$(*) \qquad h(d_K) > \frac{1}{7000} \prod_{p \mid d_K} \left(1 - \frac{[2\sqrt{p}]}{p+1}\right) \log|d_K|$$

to study the equation $h(d_K) = h$, where $h > 0$ is a fixed integer and
$d_K$ varies over all negative discriminants.

(a) Show that $1 - [2\sqrt{p}]/(p+1) \geq 1/2$ when $p \geq 11$.

(b) If $h(d_K) = h$, then use (a) and genus theory to conclude that

$$\prod_{p \mid d_K} \left(1 - \frac{[2\sqrt{p}]}{p+1}\right) \geq \frac{1}{3 \cdot 2^{\nu_2(h)+2}},$$

where $\nu_2(h)$ is the highest power of 2 dividing $h$. Hint: use
Theorem 3.15 or 6.1 to show that $d_K$ is divisible by at most
$\nu_2(h) + 1$ distinct primes.

(c) If $h(d_K) = h$, then show that (*) gives us the following estimate
for $|d_K|$:

$$|d_K| < e^{21000 \cdot 2^{\nu_2(h)+2} h}.$$

This proves that there are only finitely many negative discriminants
with class number at most $h$. Better bounds for $|d_K|$ can
be derived from (*) (see Oesterlé [81]), but the constant 1/7000
in (*) limits their usefulness. For discriminants prime to 5077,
Oesterlé hopes to improve this constant from 1/7000 to 1/55,
which would give an estimate strong enough to solve the class
number 3 problem (see [81]).

(d) If $h$ is fixed and $D \equiv 0, 1 \bmod 4$ varies over all negative
integers, show that the equation $h(D) = h$ has only finitely many
solutions. Hint: use genus theory to bound the number of primes
dividing $D$, and then use Theorem 7.24.
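
The elementary inequality in part (a) of Exercise 7.32 is easy to probe numerically before proving it. The sketch below uses the identity $[2\sqrt{p}] = \lfloor\sqrt{4p}\rfloor$ to stay in exact integer arithmetic; the helper names `local_factor` and `is_prime` and the search bound 20000 are arbitrary choices for illustration, and a finite check is of course not a proof.

```python
from fractions import Fraction
from math import isqrt

def local_factor(p):
    """The factor 1 - [2*sqrt(p)]/(p + 1), where [x] denotes the floor of x."""
    return 1 - Fraction(isqrt(4 * p), p + 1)

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

# Part (a): the factor is at least 1/2 for every prime 11 <= p < 20000.
assert all(local_factor(p) >= Fraction(1, 2)
           for p in range(11, 20000) if is_prime(p))

# The four small primes are the ones that fall below 1/2.
small = {p: local_factor(p) for p in (2, 3, 5, 7)}
assert all(value < Fraction(1, 2) for value in small.values())
```

The small primes 2, 3, 5, 7 are exactly where the extra factor of 3 and the shift in the exponent of 2 in part (b) come from.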
7.33. In Theorem 7.30, complete the proof of (i) $\Rightarrow$ (ii) sketched in the
text.


§8. CLASS FIELD THEORY AND THE CEBOTAREV DENSITY 
THEOREM 


In this section we will present a classical formulation of class field theory, 
where Abelian extensions of a number field are described in terms of cer- 
tain generalized ideal class groups. After stating the main theorems (without 
proof), we will illustrate their use by proving the Kronecker-Weber
Theorem and the existence of the Hilbert class field. We will then discuss
generalized reciprocity theorems for the $n$th power Legendre symbol
$(\alpha/\mathfrak{p})_n$ and show how quadratic reciprocity follows from class
field theory.

The Cebotarev Density Theorem hasn’t been mentioned before, but it 
provides some important information about the behavior of the Artin map. 
One of its classic applications is Dirichlet’s theorem on primes in arithmetic 
progressions, and in §9 we will use the same methods to study primes repre- 
sented by a given quadratic form. Another consequence of the Density The- 
orem is that a Galois extension of a number field is determined uniquely 
by the primes in the base field that split completely in the extension. As we 
will see, this is closely related to our basic problem of characterizing the 
primes represented by $x^2 + ny^2$.

Our account of class field theory will be incomplete in several ways, and 
at the end of the section we will discuss two of the most obvious omissions, 
norms and ideles. 


A. The Theorems of Class Field Theory 


We begin our treatment of class field theory with the notion of a modulus.
Given a number field $K$, a modulus in $K$ is a formal product

$$\mathfrak{m} = \prod_{\mathfrak{p}} \mathfrak{p}^{n_{\mathfrak{p}}}$$

over all primes $\mathfrak{p}$, finite or infinite, of $K$, where the exponents
must satisfy:

(i) $n_{\mathfrak{p}} \geq 0$, and at most finitely many are nonzero.

(ii) $n_{\mathfrak{p}} = 0$ wherever $\mathfrak{p}$ is a complex infinite prime.

(iii) $n_{\mathfrak{p}} \leq 1$ whenever $\mathfrak{p}$ is a real infinite prime.

A modulus $\mathfrak{m}$ may thus be written $\mathfrak{m}_0\mathfrak{m}_\infty$,
where $\mathfrak{m}_0$ is an $\mathcal{O}_K$-ideal and $\mathfrak{m}_\infty$ is a
product of distinct real infinite primes of $K$. When all of the
exponents $n_{\mathfrak{p}} = 0$, we set $\mathfrak{m} = 1$. Note that for a purely
imaginary field $K$ (the case we're most interested in), a modulus may be
regarded simply as an ideal of $\mathcal{O}_K$.

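Given a modulus $\mathfrak{m}$, let $I_K(\mathfrak{m})$ be the group of all
fractional $\mathcal{O}_K$-ideals relatively prime to $\mathfrak{m}$ (which means
relatively prime to $\mathfrak{m}_0$), and let $P_{K,1}(\mathfrak{m})$ be the
subgroup of $I_K(\mathfrak{m})$ generated by the principal ideals
$\alpha\mathcal{O}_K$, where $\alpha \in \mathcal{O}_K$ satisfies

$$\alpha \equiv 1 \bmod \mathfrak{m}_0 \quad\text{and}\quad \sigma(\alpha) > 0 \text{ for every real infinite prime } \sigma \text{ dividing } \mathfrak{m}_\infty.$$

A basic result is that $P_{K,1}(\mathfrak{m})$ has finite index in
$I_K(\mathfrak{m})$. When $K$ is imaginary quadratic, this is proved in
Exercise 8.1, while the general case may be found in Janusz [62, Chapter
IV.1]. A subgroup $H \subset I_K(\mathfrak{m})$ is called a congruence
subgroup for $\mathfrak{m}$ if it satisfies

$$P_{K,1}(\mathfrak{m}) \subset H \subset I_K(\mathfrak{m}),$$

and the quotient

$$I_K(\mathfrak{m})/H$$

is called a generalized ideal class group for $\mathfrak{m}$.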

For an example of these concepts, consider the modulus $\mathfrak{m} = 1$.
Then $P_K = P_{K,1}(1)$ is a congruence subgroup, so that the ideal class
group $C(\mathcal{O}_K) = I_K/P_K$ is a generalized ideal class group. We also
get some interesting examples from §7. Let $\mathcal{O}$ be an order of
conductor $f$ in an imaginary quadratic field $K$. In Proposition 7.22 we
proved that the ideal class group $C(\mathcal{O})$ can be written

$$C(\mathcal{O}) \simeq I_K(f)/P_{K,\mathbb{Z}}(f),$$

where $P_{K,\mathbb{Z}}(f)$ is generated by the principal ideals
$\alpha\mathcal{O}_K$ for $\alpha \equiv a \bmod f\mathcal{O}_K$, $a \in \mathbb{Z}$
and $\gcd(a, f) = 1$. If we use the modulus $f\mathcal{O}_K$, then the
definition of $P_{K,1}(f\mathcal{O}_K)$ shows that

$$(8.1) \qquad P_{K,1}(f\mathcal{O}_K) \subset P_{K,\mathbb{Z}}(f) \subset I_K(f) = I_K(f\mathcal{O}_K),$$

and thus $P_{K,\mathbb{Z}}(f)$ is a congruence subgroup for $f\mathcal{O}_K$.
This proves that $C(\mathcal{O})$ is a generalized ideal class group of $K$
for the modulus $f\mathcal{O}_K$. In §7, the group $P_{K,\mathbb{Z}}(f)$ seemed
awkward, but it's a very natural object from the point of view of class
field theory.

The basic idea of class field theory is that the generalized ideal class 
groups are the Galois groups of all Abelian extensions of K, and the link 
between these two is provided by the Artin map. To make this precise, we 
need to define the Artin map of an Abelian extension of K. 

Let $\mathfrak{m}$ be a modulus divisible by all ramified primes of an
Abelian extension $K \subset L$. Given a prime $\mathfrak{p}$ not dividing
$\mathfrak{m}$, we have the Artin symbol

$$\left(\frac{L/K}{\mathfrak{p}}\right) \in \mathrm{Gal}(L/K)$$

from §5. As in the discussion preceding Theorem 5.23, the Artin symbol
extends by multiplicativity to give us a homomorphism

$$\Phi_{\mathfrak{m}} : I_K(\mathfrak{m}) \longrightarrow \mathrm{Gal}(L/K)$$

which is called the Artin map for $K \subset L$ and $\mathfrak{m}$. When we
want to refer explicitly to the extension involved, we will write
$\Phi_{L/K,\mathfrak{m}}$ instead of $\Phi_{\mathfrak{m}}$.

The first theorem of class field theory tells us that $\mathrm{Gal}(L/K)$ is a
generalized ideal class group for some modulus:


Theorem 8.2. Let $K \subset L$ be an Abelian extension, and let
$\mathfrak{m}$ be a modulus divisible by all primes of $K$, finite or
infinite, that ramify in $L$. Then:

(i) The Artin map $\Phi_{\mathfrak{m}}$ is surjective.

(ii) If the exponents of the finite primes dividing $\mathfrak{m}$ are
sufficiently large, then $\ker(\Phi_{\mathfrak{m}})$ is a congruence subgroup
for $\mathfrak{m}$, i.e.,

$$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{\mathfrak{m}}) \subset I_K(\mathfrak{m}),$$

and consequently the isomorphism

$$I_K(\mathfrak{m})/\ker(\Phi_{\mathfrak{m}}) \xrightarrow{\ \sim\ } \mathrm{Gal}(L/K)$$

shows that $\mathrm{Gal}(L/K)$ is a generalized ideal class group for the
modulus $\mathfrak{m}$.

Proof. See Janusz [62, Chapter V, Theorem 5.7]. Q.E.D.


This theorem is sometimes called the Artin Reciprocity Theorem. The
key ingredient is the condition
$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{\mathfrak{m}})$, for it says (roughly)
that the Artin symbol $((L/K)/\mathfrak{p})$ depends only on $\mathfrak{p}$ up
to multiplication by $\alpha$, $\alpha \equiv 1 \bmod \mathfrak{m}$. Later in
this section we will see how Artin Reciprocity relates to quadratic, cubic
and biquadratic reciprocity.

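Let's work out an example of Theorem 8.2. Consider the extension
$\mathbb{Q} \subset \mathbb{Q}(\zeta_m)$, where $\zeta_m = e^{2\pi i/m}$ is a
primitive $m$th root of unity, and let $\mathfrak{m}$ be the modulus
$m\infty$, where $\infty$ is the real infinite prime of $\mathbb{Q}$. Using
Proposition 5.11, one sees that any prime not dividing $m$ is unramified in
$\mathbb{Q}(\zeta_m)$ (see Exercise 8.2), and it follows that the Artin map

$$\Phi_{\mathfrak{m}} : I_{\mathbb{Q}}(\mathfrak{m}) \longrightarrow \mathrm{Gal}(\mathbb{Q}(\zeta_m)/\mathbb{Q}) \simeq (\mathbb{Z}/m\mathbb{Z})^*$$

is defined. $\Phi_{\mathfrak{m}}$ can be described as follows: given
$(a/b)\mathbb{Z} \in I_{\mathbb{Q}}(\mathfrak{m})$, where $(a/b) > 0$ and
$\gcd(a, m) = \gcd(b, m) = 1$, then

$$(8.3) \qquad \Phi_{\mathfrak{m}}\!\left(\frac{a}{b}\mathbb{Z}\right) = [a][b]^{-1} \in (\mathbb{Z}/m\mathbb{Z})^*.$$

It follows easily that

$$(8.4) \qquad \ker(\Phi_{\mathfrak{m}}) = P_{\mathbb{Q},1}(\mathfrak{m})$$

(see Exercise 8.2). The importance of this computation will soon become
clear.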
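
The description (8.3) of the cyclotomic Artin map is concrete enough to compute with. The following is a minimal sketch, assuming the ideal $(a/b)\mathbb{Z}$ is given by positive integers $a, b$ prime to $m$; the function names `artin_cyclotomic` and `in_kernel` are hypothetical, and the modular inverse uses Python's built-in `pow` with exponent `-1` (available from Python 3.8).

```python
from math import gcd

def artin_cyclotomic(a, b, m):
    """Class [a][b]^{-1} in (Z/mZ)^* attached to the ideal (a/b)Z, as in (8.3).

    Assumes a, b > 0 and gcd(a, m) = gcd(b, m) = 1.
    """
    if a <= 0 or b <= 0 or gcd(a, m) != 1 or gcd(b, m) != 1:
        raise ValueError("need a, b > 0 prime to m")
    return a * pow(b, -1, m) % m

def in_kernel(a, b, m):
    """True when (a/b)Z maps to the identity class, i.e. a = b mod m."""
    return artin_cyclotomic(a, b, m) == 1 % m

m = 12
# A prime p not dividing m maps to its own residue class.
assert artin_cyclotomic(7, 1, m) == 7
# The map is multiplicative on ideals.
lhs = artin_cyclotomic(5 * 7, 11, m)
rhs = (artin_cyclotomic(5, 1, m) * artin_cyclotomic(7, 1, m)
       * artin_cyclotomic(1, 11, m)) % m
assert lhs == rhs
# (13/25)Z has a positive generator congruent to 1 mod 12, so it is in the kernel.
assert in_kernel(13, 25, m)
assert not in_kernel(5, 1, m)
```

The kernel test is the computational face of (8.4): an ideal lies in the kernel exactly when its positive generator is congruent to 1 modulo $m$.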

One difficulty with Theorem 8.2 is that the $\mathfrak{m}$ for which
$\ker(\Phi_{\mathfrak{m}})$ is a congruence subgroup is not unique. In fact, if
$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{\mathfrak{m}})$ and $\mathfrak{n}$ is
any modulus divisible by $\mathfrak{m}$ (it's clear what this means), then

$$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{\mathfrak{m}}) \implies P_{K,1}(\mathfrak{n}) \subset \ker(\Phi_{\mathfrak{n}})$$

(see Exercise 8.4), so that $\mathrm{Gal}(L/K)$ is a generalized ideal class
group for infinitely many moduli. However, there is one modulus which is
better than the others:


Theorem 8.5. Let $K \subset L$ be an Abelian extension. Then there is a
modulus $\mathfrak{f} = \mathfrak{f}(L/K)$ such that

(i) A prime of $K$, finite or infinite, ramifies in $L$ if and only if it
divides $\mathfrak{f}$.

(ii) Let $\mathfrak{m}$ be a modulus divisible by all primes of $K$ which
ramify in $L$. Then $\ker(\Phi_{\mathfrak{m}})$ is a congruence subgroup for
$\mathfrak{m}$ if and only if $\mathfrak{f} \mid \mathfrak{m}$.

Proof. See Janusz [62, Chapter V, §6 and Theorem 12.7]. Q.E.D.

The modulus $\mathfrak{f}(L/K)$ is uniquely determined by $K \subset L$ and is
called the conductor of the extension, and for this reason Theorem 8.5 is
often called the Conductor Theorem. In Exercise 8.5 we will compute the
conductor of $\mathbb{Q} \subset \mathbb{Q}(\zeta_m)$ (it need not be $m$), and
in §9 we will compute the conductor of a ring class field.

The final theorem of class field theory is the Existence Theorem, which 
asserts that every generalized ideal class group is the Galois group of some 
Abelian extension K C L. More precisely: 


Theorem 8.6. Let $\mathfrak{m}$ be a modulus of $K$, and let $H$ be a
congruence subgroup for $\mathfrak{m}$, i.e.,

$$P_{K,1}(\mathfrak{m}) \subset H \subset I_K(\mathfrak{m}).$$

Then there is a unique Abelian extension $L$ of $K$, all of whose ramified
primes, finite or infinite, divide $\mathfrak{m}$, such that if

$$\Phi_{\mathfrak{m}} : I_K(\mathfrak{m}) \longrightarrow \mathrm{Gal}(L/K)$$

is the Artin map of $K \subset L$, then

$$H = \ker(\Phi_{\mathfrak{m}}).$$

Proof. See Janusz [62, Chapter V, Theorem 9.16]. Q.E.D.


The importance of this theorem is that it allows us to construct Abelian 
extensions of K with specified Galois group and restricted ramification. 
This will be very useful in the applications that follow. 

Now that we've stated the basic theorems of class field theory, the next 
step is to indicate how they are used. We will start with two of the nicest 
applications: proofs of the Kronecker-Weber Theorem and the existence of 
the Hilbert class field. A key tool in both proofs is the following corollary 
of the uniqueness part of Theorem 8.6: 


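Corollary 8.7. Let $L$ and $M$ be Abelian extensions of $K$. Then
$L \subset M$ if and only if there is a modulus $\mathfrak{m}$, divisible by
all primes of $K$ ramified in either $L$ or $M$, such that

$$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{M/K,\mathfrak{m}}) \subset \ker(\Phi_{L/K,\mathfrak{m}}).$$

Proof. First, assume that $L \subset M$, and let
$r: \mathrm{Gal}(M/K) \to \mathrm{Gal}(L/K)$ be the restriction map. By
Theorem 8.2 and Exercise 8.4, there is a modulus $\mathfrak{m}$ for which
$\ker(\Phi_{L/K,\mathfrak{m}})$ and $\ker(\Phi_{M/K,\mathfrak{m}})$ are both
congruence subgroups for $\mathfrak{m}$. The proof of Exercise 5.16 shows that
$r \circ \Phi_{M/K,\mathfrak{m}} = \Phi_{L/K,\mathfrak{m}}$, and then
$\ker(\Phi_{M/K,\mathfrak{m}}) \subset \ker(\Phi_{L/K,\mathfrak{m}})$ follows
immediately.

Going the other way, assume that
$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{M/K,\mathfrak{m}}) \subset \ker(\Phi_{L/K,\mathfrak{m}})$.
Then, under the map
$\Phi_{M/K,\mathfrak{m}}: I_K(\mathfrak{m}) \to \mathrm{Gal}(M/K)$, the subgroup
$\ker(\Phi_{L/K,\mathfrak{m}}) \subset I_K(\mathfrak{m})$ maps to a subgroup
$H \subset \mathrm{Gal}(M/K)$. By Galois theory, $H$ corresponds to an
intermediate field $K \subset \tilde{L} \subset M$. The first part of
the proof, applied to $\tilde{L} \subset M$, shows that
$\ker(\Phi_{\tilde{L}/K,\mathfrak{m}}) = \ker(\Phi_{L/K,\mathfrak{m}})$. Then
the uniqueness part of Theorem 8.6 shows that
$L = \tilde{L} \subset M$, and we are done. Q.E.D.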


We can now prove the Kronecker-Weber Theorem, which classifies all 
Abelian extensions of Q: 


Theorem 8.8. Let $L$ be an Abelian extension of $\mathbb{Q}$. Then there is a
positive integer $m$ such that $L \subset \mathbb{Q}(\zeta_m)$,
$\zeta_m = e^{2\pi i/m}$.

Proof. By the Artin Reciprocity Theorem (Theorem 8.2), there is a modulus
$\mathfrak{m}$ such that
$P_{\mathbb{Q},1}(\mathfrak{m}) \subset \ker(\Phi_{L/\mathbb{Q},\mathfrak{m}})$,
and by Exercise 8.4, we may assume that $\mathfrak{m} = m\infty$. By (8.4) we
know that
$P_{\mathbb{Q},1}(\mathfrak{m}) = \ker(\Phi_{\mathbb{Q}(\zeta_m)/\mathbb{Q},\mathfrak{m}})$,
so that

$$P_{\mathbb{Q},1}(\mathfrak{m}) = \ker(\Phi_{\mathbb{Q}(\zeta_m)/\mathbb{Q},\mathfrak{m}}) \subset \ker(\Phi_{L/\mathbb{Q},\mathfrak{m}}).$$

Then $L \subset \mathbb{Q}(\zeta_m)$ follows from Corollary 8.7. Q.E.D.

We should mention that the Kronecker-Weber Theorem can be proved
without using class field theory (see Marcus [77, Chapter 4, Exercises
29-36]).

Next, let's discuss the Hilbert class field. To define it, apply the
Existence Theorem (Theorem 8.6) to the modulus $\mathfrak{m} = 1$ and the
subgroup $P_K \subset I_K$ (note that $P_K = P_{K,1}(\mathfrak{m})$ in this
case). Thus there is a unique Abelian extension $L$ of $K$, unramified since
$\mathfrak{m} = 1$, such that the Artin map induces an isomorphism

$$(8.9) \qquad C(\mathcal{O}_K) = I_K/P_K \xrightarrow{\ \sim\ } \mathrm{Gal}(L/K).$$

$L$ is the Hilbert class field of $K$, and its main property is the following:


Theorem 8.10. The Hilbert class field $L$ is the maximal unramified Abelian
extension of $K$.

Proof. We already know that $L$ is an unramified extension. Let $M$ be
another unramified Abelian extension. The first part of the Conductor
Theorem (Theorem 8.5) implies that $\mathfrak{f}(M/K) = 1$ since a prime
ramifies if and only if it divides the conductor, and then the second part
tells us that $\ker(\Phi_{M/K,1})$ is a congruence subgroup for the modulus
1, so that

$$P_K \subset \ker(\Phi_{M/K,1}).$$

By the definition of the Hilbert class field, this becomes

$$P_K = \ker(\Phi_{L/K,1}) \subset \ker(\Phi_{M/K,1}),$$

and then $M \subset L$ follows from Corollary 8.7. Q.E.D.


Notice that Theorems 5.18 and 5.23 from §5 are immediate consequences 
of (8.9) and Theorem 8.10. 

There is a generalization of the Hilbert class field called the ray class
field. Namely, given any modulus $\mathfrak{m}$, the Existence Theorem shows
that there is a unique Abelian extension $K_{\mathfrak{m}}$ of $K$ such that

$$P_{K,1}(\mathfrak{m}) = \ker(\Phi_{K_{\mathfrak{m}}/K,\mathfrak{m}}).$$

$K_{\mathfrak{m}}$ is called the ray class field for the modulus
$\mathfrak{m}$, and when $\mathfrak{m} = 1$, this reduces to the Hilbert class
field. Another example is given by the cyclotomic field
$\mathbb{Q}(\zeta_m)$: here, (8.4) shows that $\mathbb{Q}(\zeta_m)$ is the ray
class field of $\mathbb{Q}$ for the modulus $m\infty$. We also get a nice
interpretation of the conductor $\mathfrak{f}(L/K)$ of an arbitrary Abelian
extension $L$ of $K$: it's the smallest modulus $\mathfrak{m}$ for which $L$
is contained in the ray class field $K_{\mathfrak{m}}$ (see Exercise 8.6).

Besides proving these classical results, class field theory is also the source 
of most reciprocity theorems. In particular, we will discuss some reciprocity 
theorems for the $n$th power Legendre symbol $(\alpha/\mathfrak{p})_n$
mentioned in §5. To define this symbol, let $K$ be a number field containing
a primitive $n$th root of unity $\zeta$, and let $\mathfrak{p}$ be a prime
ideal of $\mathcal{O}_K$. Then, for $\alpha \in \mathcal{O}_K$ prime to
$\mathfrak{p}$, we have Fermat's Little Theorem

$$\alpha^{N(\mathfrak{p})-1} \equiv 1 \bmod \mathfrak{p}.$$

Suppose that in addition $\mathfrak{p}$ is prime to $n$. It can be shown that
$n \mid N(\mathfrak{p}) - 1$ (see Exercise 5.13), and it follows that
$x = \alpha^{(N(\mathfrak{p})-1)/n}$ is a solution of the congruence
$x^n \equiv 1 \bmod \mathfrak{p}$. Consequently

$$\alpha^{(N(\mathfrak{p})-1)/n} \equiv 1, \zeta, \ldots, \zeta^{n-1} \bmod \mathfrak{p}.$$

Since the $n$th roots of unity are distinct modulo $\mathfrak{p}$ (see
Exercise 5.13), $\alpha^{(N(\mathfrak{p})-1)/n}$ is congruent modulo
$\mathfrak{p}$ to a unique $n$th root of unity. This root of unity is defined
to be the $n$th power Legendre symbol $(\alpha/\mathfrak{p})_n$, so that
$(\alpha/\mathfrak{p})_n$ satisfies the congruence

$$\alpha^{(N(\mathfrak{p})-1)/n} \equiv \left(\frac{\alpha}{\mathfrak{p}}\right)_n \bmod \mathfrak{p}.$$

This symbol is a natural generalization of the Legendre symbols
$(\alpha/\pi)_3$ and $(\alpha/\pi)_4$ from cubic and biquadratic reciprocity.
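
To make the defining congruence concrete, here is a small sketch for the case $n = 4$, $K = \mathbb{Q}(i)$. It assumes $\pi = a + bi$ has prime norm $p \equiv 1 \bmod 4$, so that $\mathcal{O}_K/\pi\mathcal{O}_K \simeq \mathbb{Z}/p\mathbb{Z}$ with $i$ mapping to the residue $r$ satisfying $a + br \equiv 0 \bmod p$. The function name `quartic_symbol` and the encoding of Gaussian integers as pairs are choices made only for this illustration.

```python
def quartic_symbol(alpha, pi):
    """Return k in {0, 1, 2, 3} with (alpha/pi)_4 = i^k.

    Gaussian integers are pairs (x, y) standing for x + y*i. Assumes
    pi = (a, b) has prime norm p = a^2 + b^2 with p = 1 mod 4, and that
    alpha is prime to pi.
    """
    a, b = pi
    p = a * a + b * b
    # Z[i]/(pi) is isomorphic to Z/pZ via i -> r, where a + b*r = 0 mod p.
    r = (-a * pow(b, -1, p)) % p
    x, y = alpha
    residue = (x + y * r) % p
    if residue == 0:
        raise ValueError("alpha must be prime to pi")
    power = pow(residue, (p - 1) // 4, p)  # alpha^((N(pi) - 1)/4) mod pi
    for k in range(4):
        if power == pow(r, k, p):  # compare with the image of i^k
            return k
    raise AssertionError("unreachable when p is a prime = 1 mod 4")

# 3 = 2^4 mod 13 is a fourth power, so its symbol modulo 3 + 2i is trivial.
assert quartic_symbol((3, 0), (3, 2)) == 0
# The symbol is multiplicative in alpha.
k2 = quartic_symbol((2, 0), (3, 2))
k3 = quartic_symbol((3, 0), (3, 2))
assert quartic_symbol((6, 0), (3, 2)) == (k2 + k3) % 4
```

Working in $\mathbb{Z}/p\mathbb{Z}$ rather than with Gaussian integers directly keeps the arithmetic to a single call to modular exponentiation.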

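The $n$th power Legendre symbol can be defined for more general ideals
as follows: given an ideal $\mathfrak{a}$ of $\mathcal{O}_K$ which is prime to
$n$ and $\alpha$, we set $(\alpha/\mathfrak{a})_n$ to be the product

$$\left(\frac{\alpha}{\mathfrak{a}}\right)_n = \prod_{i=1}^{r} \left(\frac{\alpha}{\mathfrak{p}_i}\right)_n,$$

where $\mathfrak{a} = \mathfrak{p}_1 \cdots \mathfrak{p}_r$ is the prime
factorization of $\mathfrak{a}$. Thus, if $\mathfrak{m}$ is a modulus of $K$
such that every prime containing $n\alpha$ divides $\mathfrak{m}$, then the
$n$th power Legendre symbol gives a homomorphism

$$\left(\frac{\alpha}{\cdot}\right)_n : I_K(\mathfrak{m}) \longrightarrow \mu_n$$

where $\mu_n \subset \mathbb{C}^*$ is the group of $n$th roots of unity.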

We will prove two reciprocity theorems for the $n$th power Legendre
symbol, but first we need to recall a fact from Galois theory. If $K$ has a
primitive $n$th root of unity, then for $\alpha \in K$, the extension
$K \subset L = K(\sqrt[n]{\alpha})$ is Galois, and if
$\sigma \in \mathrm{Gal}(L/K)$, then
$\sigma(\sqrt[n]{\alpha}) = \zeta\sqrt[n]{\alpha}$ for some $n$th root of unity
$\zeta$. This gives us a map $\sigma \mapsto \zeta$, which defines an injective
homomorphism

$$\mathrm{Gal}(L/K) \hookrightarrow \mu_n.$$

We can now state our first reciprocity theorem for $(\alpha/\mathfrak{a})_n$:


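Theorem 8.11 (Weak Reciprocity). Let $K$ be a number field containing a
primitive $n$th root of unity, and let $L = K(\sqrt[n]{\alpha})$, where
$\alpha \in \mathcal{O}_K$ is nonzero. Assume that $\mathfrak{m}$ is a modulus
divisible by all primes of $K$ containing $n\alpha$, and assume in addition
that $\ker(\Phi_{L/K,\mathfrak{m}})$ is a congruence subgroup for
$\mathfrak{m}$. Then there is a commutative diagram

$$\begin{array}{ccc} I_K(\mathfrak{m}) & \xrightarrow{\ \Phi_{L/K,\mathfrak{m}}\ } & \mathrm{Gal}(L/K) \\ & {\scriptstyle (\alpha/\cdot)_n}\searrow & \downarrow \\ & & \mu_n \end{array}$$

where $\mathrm{Gal}(L/K) \to \mu_n$ is the natural injection. Thus, if $G$ is
the image of $\mathrm{Gal}(L/K)$ in $\mu_n$, then the $n$th power Legendre
symbol $(\alpha/\mathfrak{a})_n$ induces a surjective homomorphism

$$\left(\frac{\alpha}{\cdot}\right)_n : I_K(\mathfrak{m})/P_{K,1}(\mathfrak{m}) \longrightarrow G \subset \mu_n.$$

Proof. To prove that the diagram commutes, it suffices to show

$$\left(\frac{L/K}{\mathfrak{p}}\right)(\sqrt[n]{\alpha}) = \left(\frac{\alpha}{\mathfrak{p}}\right)_n \sqrt[n]{\alpha}.$$

This is an easy consequence of the definition of the Artin symbol (from
Lemma 5.19). The case $n = 3$ was proved in (5.22), and for general $n$, see
Exercise 5.14.

Turning to the final statement of the theorem, recall that
$\ker(\Phi_{L/K,\mathfrak{m}})$ is a congruence subgroup for $\mathfrak{m}$.
Thus
$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{L/K,\mathfrak{m}}) \subset I_K(\mathfrak{m})$,
so that the Artin map $\Phi_{L/K,\mathfrak{m}}$ induces a surjective
homomorphism

$$I_K(\mathfrak{m})/P_{K,1}(\mathfrak{m}) \longrightarrow I_K(\mathfrak{m})/\ker(\Phi_{L/K,\mathfrak{m}}) \xrightarrow{\ \sim\ } \mathrm{Gal}(L/K).$$

Using the above commutative diagram, the theorem follows immediately.
Q.E.D.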


This result is called "Weak Reciprocity" because rather than giving
formulas for computing $(\alpha/\mathfrak{a})_n$, the theorem simply asserts
that the symbol is a homomorphism on an appropriate group. Nevertheless,
Weak Reciprocity is a powerful result. For example, let's use it to prove
quadratic reciprocity:


Theorem 8.12. Let $p$ and $q$ be distinct odd primes. Then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{(p-1)(q-1)/4}.$$

Proof. Recall from §1 that quadratic reciprocity can be written in the form

$$\left(\frac{p^*}{q}\right) = \left(\frac{q}{p}\right),$$

where $p^* = (-1)^{(p-1)/2} p$.

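The first step is to study $\mathbb{Q} \subset \mathbb{Q}(\sqrt{p^*})$. By (8.3)
and (8.4), $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$ is a generalized ideal
class group for the modulus $p\infty$, which implies that the same is true
for any subfield of $\mathbb{Q}(\zeta_p)$ (see Exercise 8.7). Since
$\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$ is cyclic of order $p - 1$, there
is a unique subfield $\mathbb{Q} \subset K \subset \mathbb{Q}(\zeta_p)$ which is
quadratic over $\mathbb{Q}$. Then $\mathrm{Gal}(K/\mathbb{Q})$ is a generalized
ideal class group for $p\infty$, which implies that $p$ is the only finite
prime of $\mathbb{Q}$ that ramifies in $K$. If we write
$K = \mathbb{Q}(\sqrt{m})$, $m$ squarefree, then Corollary 5.17 implies that
$m = p^*$, and hence $K = \mathbb{Q}(\sqrt{p^*})$ (see Exercise 8.7).

It follows that $\ker(\Phi_{\mathbb{Q}(\sqrt{p^*})/\mathbb{Q},\,p\infty})$ is a
congruence subgroup for $p\infty$, and thus by Weak Reciprocity, the
Legendre symbol $(p^*/\cdot)$ gives a surjective homomorphism

$$(8.13) \qquad I_{\mathbb{Q}}(p\infty)/P_{\mathbb{Q},1}(p\infty) \longrightarrow \{\pm 1\}.$$

However, the map sending $[a] \in (\mathbb{Z}/p\mathbb{Z})^*$ to
$[a\mathbb{Z}] \in I_{\mathbb{Q}}(p\infty)/P_{\mathbb{Q},1}(p\infty)$ induces an
isomorphism
$(\mathbb{Z}/p\mathbb{Z})^* \xrightarrow{\sim} I_{\mathbb{Q}}(p\infty)/P_{\mathbb{Q},1}(p\infty)$
(see Exercise 8.7). Composing this map with (8.13) shows that $(p^*/\cdot)$
induces a surjective homomorphism from $(\mathbb{Z}/p\mathbb{Z})^*$ to
$\{\pm 1\}$. But the Legendre symbol $(\cdot/p)$ is also a surjective
homomorphism between the same two groups, and since
$(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic, there is only one such homomorphism.
This proves that

$$\left(\frac{p^*}{q}\right) = \left(\frac{q}{p}\right),$$

and we are done. Q.E.D.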


The proof just given is closely related to the discussion of quadratic
reciprocity from §1. Recall that a key result implicit in Euler's work was
Lemma 1.14, which showed that $(D/\cdot)$ gives a well-defined homomorphism
defined on $(\mathbb{Z}/D\mathbb{Z})^*$ when $D \equiv 0, 1 \bmod 4$. The above
argument uses Weak Reciprocity to prove this when $D = p^*$. In this way
Weak Reciprocity (or more generally, Artin Reciprocity) may be regarded as
a far-reaching generalization of Lemma 1.14.
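
Both forms of quadratic reciprocity used in the proof of Theorem 8.12 are easy to test numerically. The following sketch checks them for a handful of small primes, computing the Legendre symbol by Euler's criterion; the helper name `legendre` and the list of primes are choices made for this illustration only.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a % p, (p - 1) // 2, p)  # t is 0, 1 or p - 1
    return -1 if t == p - 1 else t

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
for p in odd_primes:
    p_star = (-1) ** ((p - 1) // 2) * p
    for q in odd_primes:
        if p != q:
            # The form used in the proof: (p*/q) = (q/p).
            assert legendre(p_star, q) == legendre(q, p)
            # The classical statement of Theorem 8.12.
            assert (legendre(p, q) * legendre(q, p)
                    == (-1) ** ((p - 1) * (q - 1) // 4))
```

Note that `a % p` is nonnegative in Python even when `a` is negative, which is what allows $p^*$ to be passed in directly.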

Before we can state our second reciprocity theorem for the $n$th power
Legendre symbol, we need some notation: if $\alpha$ and $\beta$ are in
$\mathcal{O}_K$, then $(\alpha/\beta\mathcal{O}_K)_n$ is written simply
$(\alpha/\beta)_n$ when defined. Then we have the following reciprocity
theorem for $(\alpha/\beta)_n$:


Theorem 8.14 (Strong Reciprocity). Let $K$ be a number field containing a
primitive $n$th root of unity, and suppose that
$\alpha, \beta \in \mathcal{O}_K$ are relatively prime to each other and to
$n$. Then

$$\left(\frac{\alpha}{\beta}\right)_n \left(\frac{\beta}{\alpha}\right)_n^{-1} = \prod_{\mathfrak{p} \mid n\infty} \left(\frac{\alpha, \beta}{\mathfrak{p}}\right)_n,$$

where $(\alpha, \beta/\mathfrak{p})_n$ is the $n$th power Hilbert symbol (to be
discussed below) and $\infty$ is the product of the real infinite primes of
$K$ (which can occur only when $n = 2$).


Proof. While Weak Reciprocity was an immediate consequence of Artin
Reciprocity, Strong Reciprocity is a different matter, for here one must
first study the $n$th power Hilbert symbol

$$\left(\frac{\alpha, \beta}{\mathfrak{p}}\right)_n.$$

This symbol is an $n$th root of unity defined using the local class field
theory of the completion $K_{\mathfrak{p}}$ of $K$ at the prime
$\mathfrak{p}$. Since we haven't discussed local methods, we can't even give
a precise definition. A full discussion of the Hilbert symbol is given in
Hasse [49, Part II, §§11-12, pp. 53-64] and Neukirch [80, §§III.5 and IV.9,
pp. 50-55 and 110-112], and both references present a complete proof of the
Strong Reciprocity theorem. In Exercise 8.9 we will list the main
properties of the Hilbert symbol. Q.E.D.


To get a better idea of how Strong Reciprocity works, let's apply it to
cubic reciprocity. Here, $n = 3$ and $K = \mathbb{Q}(\omega)$,
$\omega = e^{2\pi i/3}$, and the only prime of $\mathcal{O}_K$ dividing 3 is
$\lambda = 1 - \omega$. Thus, given nonassociate primes $\pi$ and $\theta$ in
$\mathcal{O}_K$, Strong Reciprocity tells us that

$$\left(\frac{\pi}{\theta}\right)_3 \left(\frac{\theta}{\pi}\right)_3^{-1} = \left(\frac{\pi, \theta}{\lambda}\right)_3.$$

Hence, to prove cubic reciprocity, it suffices to show that

$$(8.15) \qquad \pi, \theta \text{ primary} \implies \left(\frac{\pi, \theta}{\lambda}\right)_3 = 1.$$

The proof of cubic reciprocity is thus reduced to a purely local computation
in the completion $K_\lambda$ of $K$ at $\lambda$. Given the properties of
the Hilbert symbol, (8.15) is not difficult to prove (see Exercise 8.9).
Biquadratic reciprocity can be proved similarly, though the proof is a bit
more complicated (see Hasse [49, Part II, §20, pp. 105-106]). This shows
that class field theory encompasses all of the reciprocity theorems we've
seen so far.


B. The Cebotarev Density Theorem 


The Cebotarev Density Theorem will provide some very useful information 
about the Artin map. But first, we need to define the notion of Dirichlet 
density. 
Let $K$ be a number field, and let $\mathcal{P}_K$ be the set of all finite
primes of $K$. Given a subset $S \subset \mathcal{P}_K$, the Dirichlet density
of $S$ is defined to be

$$\delta(S) = \lim_{s \to 1^+} \frac{\sum_{\mathfrak{p} \in S} N(\mathfrak{p})^{-s}}{-\log(s - 1)},$$

provided the limit exists. The basic properties of the Dirichlet density
are:

(i) $\delta(\mathcal{P}_K) = 1$.

(ii) If $S \subset T$ and $\delta(S)$ and $\delta(T)$ exist, then
$\delta(S) \leq \delta(T)$.

(iii) If $\delta(S)$ exists, then $0 \leq \delta(S) \leq 1$.

(iv) If $S$ and $T$ are disjoint and $\delta(S)$ and $\delta(T)$ exist, then
$\delta(S \cup T) = \delta(S) + \delta(T)$.

(v) If $S$ is finite, then $\delta(S) = 0$.

(vi) If $\delta(S)$ exists and $T$ differs from $S$ by finitely many
elements, then $\delta(T) = \delta(S)$.

To prove these properties, one first must study the Dedekind zeta function
$\zeta_K(s)$ of $K$. This function is defined by

$$\zeta_K(s) = \sum_{\mathfrak{a} \subset \mathcal{O}_K} N(\mathfrak{a})^{-s} = \prod_{\mathfrak{p} \in \mathcal{P}_K} \left(1 - N(\mathfrak{p})^{-s}\right)^{-1}.$$

One can prove without difficulty that $\zeta_K(s)$ converges absolutely for
$\mathrm{Re}(s) > 1$ (see Janusz [62, §IV.4] or Neukirch [80, §V.6]). This
implies that for any $S \subset \mathcal{P}_K$, the sum
$\sum_{\mathfrak{p} \in S} N(\mathfrak{p})^{-s}$ converges absolutely for
$\mathrm{Re}(s) > 1$ (see Exercise 8.10). A much deeper property of
$\zeta_K(s)$ is that it has a simple pole at $s = 1$, which enables one to
prove

$$1 = \lim_{s \to 1^+} \frac{\log(\zeta_K(s))}{-\log(s - 1)} = \lim_{s \to 1^+} \frac{\sum_{\mathfrak{p} \in \mathcal{P}_K} N(\mathfrak{p})^{-s}}{-\log(s - 1)}$$

(see Janusz [62, §IV.4] or Neukirch [80, §V.6]). This proves (i), and it is
now straightforward to prove (ii)-(vi) (see Exercise 8.10).

There is one more property of the Dirichlet density which is sometimes
useful. Let
$\mathcal{P}_{K,1} = \{\mathfrak{p} \in \mathcal{P}_K : N(\mathfrak{p}) \text{ is prime}\}$.
$\mathcal{P}_{K,1}$ is sometimes called the degree 1 primes in $K$ (recall
that in general, $N(\mathfrak{p}) = p^f$, where $f$ is the inertial degree of
$p \in \mathfrak{p}$ in the extension $\mathbb{Q} \subset K$). Then one can
prove that

$$(8.16) \qquad \delta(S) = \delta(S \cap \mathcal{P}_{K,1})$$

whenever $\delta(S)$ exists (see Janusz [62, §IV.4] or Neukirch [80, §V.6]).

Now let $L$ be a Galois extension of $K$, possibly non-Abelian. If
$\mathfrak{p}$ is a prime of $K$ unramified in $L$, then different primes
$\mathfrak{P}$ of $L$ containing $\mathfrak{p}$ may give us different Artin
symbols $((L/K)/\mathfrak{P})$. But all of the $((L/K)/\mathfrak{P})$ are
conjugate by Corollary 5.21, and in fact they form a complete conjugacy
class in $\mathrm{Gal}(L/K)$ (see Exercise 5.12). Thus we can define the Artin
symbol $((L/K)/\mathfrak{p})$ of $\mathfrak{p}$ to be this conjugacy class in
$\mathrm{Gal}(L/K)$. We can now state the Cebotarev Density Theorem:


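Theorem 8.17. Let $L$ be a Galois extension of $K$, and let
$\langle\sigma\rangle$ be the conjugacy class of an element
$\sigma \in \mathrm{Gal}(L/K)$. Then the set

$$S = \{\mathfrak{p} \in \mathcal{P}_K : \mathfrak{p} \text{ is unramified in } L \text{ and } ((L/K)/\mathfrak{p}) = \langle\sigma\rangle\}$$

has Dirichlet density

$$\delta(S) = \frac{|\langle\sigma\rangle|}{|\mathrm{Gal}(L/K)|} = \frac{|\langle\sigma\rangle|}{[L : K]}.$$

Proof. See Janusz [62, Chapter V, Theorem 10.4] or Neukirch [80, Chapter
V, Theorem 6.4]. Q.E.D.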


Notice that the set S of the theorem must be infinite since it has positive 
density (this follows from property (v) above). In particular, we get the 
following corollary for Abelian extensions: 


Corollary 8.18. Let $L$ be an Abelian extension of $K$, and let
$\mathfrak{m}$ be a modulus divisible by all primes that ramify in $L$. Then,
given any element $\sigma \in \mathrm{Gal}(L/K)$, the set of primes
$\mathfrak{p}$ not dividing $\mathfrak{m}$ such that
$((L/K)/\mathfrak{p}) = \sigma$ has density $1/[L : K]$ and hence is infinite.

Proof. When $\mathrm{Gal}(L/K)$ is Abelian, the conjugacy class
$\langle\sigma\rangle$ is just the set $\{\sigma\}$. Q.E.D.


This corollary shows that the Artin map
$\Phi_{L/K,\mathfrak{m}}: I_K(\mathfrak{m}) \to \mathrm{Gal}(L/K)$ is
surjective in a very strong sense.
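
The non-Abelian case of Theorem 8.17 can also be observed experimentally. As a hedged illustration (the example is not taken from the text), take $K = \mathbb{Q}$ and $L$ the splitting field of $x^3 - 2$, whose Galois group is $S_3$. For a prime $p \neq 2, 3$, the number of roots of $x^3 - 2$ modulo $p$ records the conjugacy class of the Artin symbol: three roots for the identity (1 element), one root for the transpositions (3 elements), and no roots for the 3-cycles (2 elements). The sketch below tabulates these counts; it measures naive proportions up to a bound rather than Dirichlet density, and the bound 5000 and all names are arbitrary choices.

```python
from collections import Counter

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def root_count(p):
    """Number of roots of x^3 - 2 modulo the prime p."""
    return sum(1 for x in range(p) if (x * x * x - 2) % p == 0)

# 2 and 3 are the primes dividing the discriminant of x^3 - 2; skip them.
primes = [p for p in primes_up_to(5000) if p not in (2, 3)]
counts = Counter(root_count(p) for p in primes)
freq = {k: counts[k] / len(primes) for k in (0, 1, 3)}

# Expected proportions from class sizes in S_3: 2/6, 3/6 and 1/6.
for roots, expected in ((0, 2 / 6), (1, 3 / 6), (3, 1 / 6)):
    print(roots, round(freq[roots], 3), "expected about", round(expected, 3))
```

The observed proportions only approximate the theoretical densities, and the agreement improves slowly as the bound grows.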

An especially nice case is when $K = \mathbb{Q}$ and
$L = \mathbb{Q}(\zeta_m)$, for here Corollary 8.18 gives a quick proof of
Dirichlet's theorem on primes in arithmetic progressions (the details are
left to the reader; see Exercise 8.11). In §9 we will apply these same
ideas to study the primes represented by a fixed quadratic form
$ax^2 + bxy + cy^2$.
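
For the cyclotomic case, Corollary 8.18 predicts that each residue class modulo $m$ prime to $m$ contains a proportion $1/\phi(m)$ of the primes. The sketch below checks this empirically for the illustrative choice $m = 12$, where $\phi(12) = 4$. It counts primes up to a fixed bound, which approximates natural rather than Dirichlet density; the modulus, the bound and the variable names are assumptions made for the example.

```python
from collections import Counter
from math import gcd

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

m, bound = 12, 100_000
classes = [a for a in range(1, m) if gcd(a, m) == 1]  # [1, 5, 7, 11]
counts = Counter(p % m for p in primes_up_to(bound) if gcd(p, m) == 1)
total = sum(counts.values())
shares = {a: counts[a] / total for a in classes}

for a in classes:
    print(f"p = {a} mod {m}: share {shares[a]:.4f}")
```

Each printed share should be close to $1/4$, in line with the density $1/[L : K] = 1/\phi(m)$ given by the corollary.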

Another application of Cebotarev Density concerns primes that split
completely in a Galois extension $K \subset L$. Namely, if we apply Theorem
8.17 to the conjugacy class of the identity element, we see that the primes
in $K$ for which $((L/K)/\mathfrak{p}) = 1$ have density $1/[L : K]$. However,
from Corollary 5.21, we know that

$$\left(\frac{L/K}{\mathfrak{p}}\right) = 1 \iff \mathfrak{p} \text{ splits completely in } L.$$

Thus the primes that split completely in $L$ have density $1/[L : K]$, and in
particular there are infinitely many of them. The unexpected fact is that
these primes characterize the extension $K \subset L$ uniquely. Before we can
prove this, we need to introduce some terminology.

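Given two sets S and T, we say that $S \,\dot\subset\, T$ if
$S \subset T \cup \Sigma$ for some finite set $\Sigma$, and $S \doteq T$ means
that $S \,\dot\subset\, T$ and $T \,\dot\subset\, S$. Also, given a finite
extension $K \subset L$, we set

$$\mathcal{S}_{L/K} = \{\mathfrak{p} \in P_K : \mathfrak{p} \text{ splits completely in } L\}.$$

We can now state our result: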


Theorem 8.19. Let L and M be Galois extensions of K. Then
(i) $L \subset M \iff \mathcal{S}_{M/K} \,\dot\subset\, \mathcal{S}_{L/K}$.
(ii) $L = M \iff \mathcal{S}_{M/K} \doteq \mathcal{S}_{L/K}$.


Proof. Notice that (ii) is an immediate consequence of (i). As for (i), we 
will prove the following more general result which applies when only one 
of L or M is Galois over K. This will be useful in §§9 and 11. 


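Proposition 8.20. Let L and M be finite extensions of K.
(i) If M is Galois over K, then
$L \subset M \iff \mathcal{S}_{M/K} \,\dot\subset\, \mathcal{S}_{L/K}$.
(ii) If L is Galois over K, then
$L \subset M \iff \tilde{\mathcal{S}}_{M/K} \,\dot\subset\, \mathcal{S}_{L/K}$,
where $\tilde{\mathcal{S}}_{M/K}$ is defined by

$$\tilde{\mathcal{S}}_{M/K} = \{\mathfrak{p} \in P_K : \mathfrak{p} \text{ unramified in } M,\ f_{\mathfrak{P}|\mathfrak{p}} = 1 \text{ for some prime } \mathfrak{P} \text{ of } M\}.$$


Remark. If M is Galois over K, then $\tilde{\mathcal{S}}_{M/K}$ reduces to
$\mathcal{S}_{M/K}$ (see Exercise 8.12), and thus either part of Proposition
8.20 implies Theorem 8.19.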


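Proof. We start with the proof of (ii). When $L \subset M$, it is easy to see
that $\tilde{\mathcal{S}}_{M/K} \subset \mathcal{S}_{L/K}$ (see Exercise 8.12).
Conversely, assume that
$\tilde{\mathcal{S}}_{M/K} \,\dot\subset\, \mathcal{S}_{L/K}$, and let N be a
Galois extension of K containing both L and M. By Galois theory, it suffices to
show that $\mathrm{Gal}(N/M) \subset \mathrm{Gal}(N/L)$. Thus, given
$\sigma \in \mathrm{Gal}(N/M)$, we need to prove that $\sigma|_L$ is the
identity.

By the Cebotarev Density Theorem, there is a prime $\mathfrak{p}$ in K,
unramified in N, such that $((N/K)/\mathfrak{p})$ is the conjugacy class of
$\sigma$. Thus there is some prime $\mathfrak{P}$ of N containing
$\mathfrak{p}$ such that $((N/K)/\mathfrak{P}) = \sigma$. We claim that
$\mathfrak{p} \in \tilde{\mathcal{S}}_{M/K}$. To see why, let
$\mathfrak{P}' = \mathfrak{P} \cap \mathcal{O}_M$. Then, for
$\alpha \in \mathcal{O}_M$, we have

$$\alpha \equiv \sigma(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P}'.$$

The first congruence follows from $\sigma|_M = 1$, and the second follows by
the definition of the Artin symbol (see Lemma 5.19). Thus
$\mathcal{O}_M/\mathfrak{P}' \simeq \mathcal{O}_K/\mathfrak{p}$, so that
$f_{\mathfrak{P}'|\mathfrak{p}} = 1$. This shows that
$\mathfrak{p} \in \tilde{\mathcal{S}}_{M/K}$ as claimed.

The Density Theorem implies that there are infinitely many such
$\mathfrak{p}$'s. Thus our hypothesis
$\tilde{\mathcal{S}}_{M/K} \,\dot\subset\, \mathcal{S}_{L/K}$ allows us to
assume that $\mathfrak{p} \in \mathcal{S}_{L/K}$, i.e.,
$((L/K)/\mathfrak{p}) = 1$. But Exercise 5.9 tells us that
$((L/K)/\mathfrak{p}) = ((N/K)/\mathfrak{P})|_L$. Since
$\sigma = ((N/K)/\mathfrak{P})$, we see that $\sigma|_L = 1$ as desired.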

To prove (i), first note $L \subset M$ easily implies
$\mathcal{S}_{M/K} \subset \mathcal{S}_{L/K}$ (see Exercise 8.12). To prove the
other direction, let $L'$ be the Galois closure of L over K. It is a standard
fact that a prime of K splits completely in L if and only if it splits
completely in $L'$ (see Exercises 8.13-8.15 or Marcus [77, Corollary to
Theorem 31]). This implies that $\mathcal{S}_{L/K} = \mathcal{S}_{L'/K}$.
Since M is Galois over K, we've already observed that
$\tilde{\mathcal{S}}_{M/K} = \mathcal{S}_{M/K}$. Thus our hypothesis
$\mathcal{S}_{M/K} \,\dot\subset\, \mathcal{S}_{L/K}$ can be written
$\tilde{\mathcal{S}}_{M/K} \,\dot\subset\, \mathcal{S}_{L'/K}$, so that by
part (ii) we obtain $L' \subset M$, which obviously implies $L \subset M$.
This completes the proofs of Proposition 8.20 and Theorem 8.19. Q.E.D.


Theorem 8.19 is closely related to Corollary 8.7. The reason is that if
$K \subset L$ is Abelian, then the set $\mathcal{S}_{L/K}$ of primes that split
completely is, up to a finite set, exactly the prime ideals in
$\ker(\Phi_{L/K,\mathfrak{m}})$, where $\mathfrak{m}$ is any modulus divisible
by all of the ramified primes. Thus we don't need the whole kernel of the
Artin map to determine the extension: just the primes in it will suffice! In
particular, this shows that Theorem 8.19 is relevant to our question of which
primes p are of the form $x^2 + ny^2$. To see why, consider the situation of
Theorem 5.1. Here, K is an imaginary quadratic field of discriminant
$d_K = -4n$ (which means that n satisfies (5.2)). Then, by Theorem 5.26,

$$p = x^2 + ny^2 \iff p \text{ splits completely in the Hilbert class field of } K$$

whenever p is an odd prime not dividing n. Thus Theorem 8.19 shows that the
primes represented by $x^2 + ny^2$ characterize the Hilbert class field of
$\mathbb{Q}(\sqrt{-n})$ uniquely. In §9 we will give a version of this result
that holds for arbitrary n.


C. Norms and Ideles 


Our discussion of class field theory has omitted several important topics. 
To give the reader a sense of what’s been left out, we will say a few words 
about norms and ideles. 

Given a finite extension $K \subset L$, there is the norm map
$N_{L/K} : L^* \to K^*$, and $N_{L/K}$ can be extended to a map of ideals

$$N_{L/K} : I_L \to I_K$$

(see Janusz [62, §I.8]). The importance of the norm map is that it gives a
precise description of the kernel of the Artin map. Specifically, let L be
an Abelian extension of K, and let $\mathfrak{m}$ be a modulus for which
$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{L/K,\mathfrak{m}})$. Then an
important part of the Artin Reciprocity Theorem states that

(8.21) $\ker(\Phi_{L/K,\mathfrak{m}}) = N_{L/K}(I_L(\mathfrak{m}))\, P_{K,1}(\mathfrak{m})$

(see Janusz [62, Chapter V, Theorem 5.7]). Norms play an essential role in
the proofs of the theorems of class field theory.

Class field theory can be presented without reference to ideles (as we
have done above), but the idelic approach has some distinct advantages.
Before we can see why, we need some definitions. Given a number field K,
the idele group $\mathbf{I}_K$ is the restricted product

$$\mathbf{I}_K = {\prod_{\mathfrak{p}}}^{*} K_{\mathfrak{p}}^*,$$

where $\mathfrak{p}$ runs over all primes of K, finite and infinite, and
$K_{\mathfrak{p}}$ is the completion of K at $\mathfrak{p}$. The symbol
$\prod^*$ means that $\mathbf{I}_K$ consists of all tuples $(x_{\mathfrak{p}})$
such that $x_{\mathfrak{p}} \in \mathcal{O}_{K_{\mathfrak{p}}}^*$ for all but
finitely many $\mathfrak{p}$. $\mathbf{I}_K$ is a locally compact topological
group, and the multiplicative group $K^*$ imbeds naturally in $\mathbf{I}_K$
as a discrete subgroup (see Neukirch [80, §V.2] for all of this). The quotient
group

$$C_K = \mathbf{I}_K / K^*$$

is called the idele class group.

We can now restate the theorems of class field theory using ideles. Given
an Abelian extension L of K, there is an Artin map

$$\Phi_{L/K} : C_K \to \mathrm{Gal}(L/K)$$

which is continuous and surjective. This is the idele theoretic analog of the
Artin Reciprocity Theorem. Note that $\ker(\Phi_{L/K})$ is a closed subgroup
of finite index in $C_K$. There is also an idelic version of the Existence
Theorem, which asserts that there is a 1-1 correspondence between the Abelian
extensions of K and the closed subgroups of finite index in $C_K$. The nice
feature of this approach is that it always uses the same group $C_K$, unlike
our situation, where we had to vary the modulus $\mathfrak{m}$ in
$I_K(\mathfrak{m})$ as we moved from one Abelian extension to the next.

Norms also play an important role in the idelic theory. Given an Abelian
extension L of K, there is a norm map

$$N_{L/K} : C_L \to C_K,$$

and the idelic analog of (8.21) is that the kernel of the Artin map
$\Phi_{L/K} : C_K \to \mathrm{Gal}(L/K)$ is exactly $N_{L/K}(C_L)$. Thus the
subgroups of $C_K$ of finite index are precisely the norm groups
$N_{L/K}(C_L)$.

Standard references for the idele theoretic formulation of class field
theory are Neukirch [80] and Weil [104]. Neukirch also explains carefully the
relation between the two approaches to class field theory.


D. Exercises 


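8.1. Let K be an imaginary quadratic field, and let $\mathfrak{m}$ be a
     modulus for K (which can be regarded as an ideal of $\mathcal{O}_K$). We
     want to show that $P_{K,1}(\mathfrak{m})$ has finite index in
     $I_K(\mathfrak{m})$.

     (a) Show that the map $\alpha \mapsto \alpha\mathcal{O}_K$ induces a
         well-defined homomorphism

         $$\phi : (\mathcal{O}_K/\mathfrak{m})^* \to I_K(\mathfrak{m}) \cap P_K / P_{K,1}(\mathfrak{m}),$$

         and then show that there is an exact sequence

         $$\mathcal{O}_K^* \to (\mathcal{O}_K/\mathfrak{m})^* \xrightarrow{\ \phi\ } I_K(\mathfrak{m}) \cap P_K / P_{K,1}(\mathfrak{m}) \to 1.$$

         Conclude that $I_K(\mathfrak{m}) \cap P_K / P_{K,1}(\mathfrak{m})$ is
         finite. Hint: see the proof of Theorem 7.24.

     (b) Adapt the exact sequence (7.25) to show that
         $I_K(\mathfrak{m})/P_{K,1}(\mathfrak{m})$ is finite (recall that
         $C(\mathcal{O}_K)$ is finite by §2 and Theorem 7.7).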


8.2. This problem is concerned with the Artin map of the cyclotomic
     extension $\mathbb{Q} \subset \mathbb{Q}(\zeta_m)$, where
     $\zeta_m = e^{2\pi i/m}$. We will assume that $m > 2$.
     (a) Use Proposition 5.11 to prove that all finite ramified primes of
         this extension divide m. Thus the Artin map $\Phi_{m\infty}$ is
         defined.
     (b) Show that $\Phi_{m\infty} : I_{\mathbb{Q}}(m\infty) \to
         \mathrm{Gal}(\mathbb{Q}(\zeta_m)/\mathbb{Q}) \simeq
         (\mathbb{Z}/m\mathbb{Z})^*$ is as described in (8.3). Hint: use
         Lemma 5.19.
     (c) Conclude that $\ker(\Phi_{m\infty}) = P_{\mathbb{Q},1}(m\infty)$.


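8.3. Let $\mathbb{Q} \subset \mathbb{Q}(\zeta_m)$ be as in the previous
     problem, and assume that $m > 2$.

     (a) Show that $\mathbb{R} \cap \mathbb{Q}(\zeta_m) =
         \mathbb{Q}(\cos(2\pi/m))$, and then conclude that
         $[\mathbb{Q}(\cos(2\pi/m)) : \mathbb{Q}] = (1/2)\phi(m)$.

     (b) Compute the Artin map $\Phi_m : I_{\mathbb{Q}}(m) \to
         \mathrm{Gal}(\mathbb{Q}(\cos(2\pi/m))/\mathbb{Q}) \simeq
         (\mathbb{Z}/m\mathbb{Z})^*/\{\pm 1\}$. Hint: use the previous
         exercise.

     (c) Show that $\ker(\Phi_m) = P_{\mathbb{Q},1}(m)$.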


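8.4. Let $K \subset L$ be an Abelian extension, and let $\mathfrak{m}$ be a
     modulus for which the Artin map $\Phi_{\mathfrak{m}}$ is defined. If
     $\mathfrak{n}$ is another modulus and $\mathfrak{m} \mid \mathfrak{n}$,
     prove that

     $$P_{K,1}(\mathfrak{m}) \subset \ker(\Phi_{\mathfrak{m}}) \implies P_{K,1}(\mathfrak{n}) \subset \ker(\Phi_{\mathfrak{n}}).$$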


8.5. Prove that the conductor of the cyclotomic extension
     $\mathbb{Q} \subset \mathbb{Q}(\zeta_m)$ is given by

     $$\mathfrak{f}(\mathbb{Q}(\zeta_m)/\mathbb{Q}) = \begin{cases} 1 & m \le 2 \\ (m/2)\infty & m = 2n,\ n > 1 \text{ odd} \\ m\infty & \text{otherwise.} \end{cases}$$

     Hint: when $m > 2$, use Theorem 8.5 and Exercise 8.2 to show that the
     conductor is of the form $n\infty$ for some n dividing m. Then use
     Corollary 8.7 to show that $\mathbb{Q}(\zeta_m) \subset
     \mathbb{Q}(\zeta_n)$, which implies that $\phi(m) \mid \phi(n)$. The
     formula for $\mathfrak{f}(\mathbb{Q}(\zeta_m)/\mathbb{Q})$ now follows
     from elementary arguments about the Euler $\phi$-function.


8.6. This exercise is concerned with conductors.

     (a) Given a modulus $\mathfrak{m}$ for a number field K, let
         $K_{\mathfrak{m}}$ denote the ray class field defined in the text.
         If L is an Abelian extension of K, then show that the conductor
         $\mathfrak{f}(L/K)$ is the greatest common divisor of all moduli
         $\mathfrak{m}$ for which $L \subset K_{\mathfrak{m}}$.

     (b) If L is an Abelian extension of $\mathbb{Q}$, let m be the smallest
         positive integer for which $L \subset \mathbb{Q}(\zeta_m)$ (note
         that m exists by the Kronecker-Weber Theorem). Then show that

         $$\mathfrak{f}(L/\mathbb{Q}) = \begin{cases} m & \text{if } L \subset \mathbb{R} \\ m\infty & \text{otherwise.} \end{cases}$$


8.7. In this exercise we will fill in some of the details omitted in the
     proof of quadratic reciprocity given in Theorem 8.12. Let p be an
     odd prime.

     (a) If $K \subset L$ is an Abelian extension such that
         $\mathrm{Gal}(L/K)$ is a generalized ideal class group for the
         modulus $\mathfrak{m}$ of K, then prove that the same is true for
         any intermediate field $K \subset M \subset L$.

     (b) If K is a quadratic field which ramifies only at p, then use
         Corollary 5.17 to show that $K = \mathbb{Q}(\sqrt{p^*})$,
         $p^* = (-1)^{(p-1)/2} p$.

     (c) Show that the map $a \mapsto a\mathbb{Z}$ induces an isomorphism
         $(\mathbb{Z}/p\mathbb{Z})^* \xrightarrow{\ \sim\ }
         I_{\mathbb{Q}}(p\infty)/P_{\mathbb{Q},1}(p\infty)$.


8.8. This exercise will adapt the proof of Theorem 8.12 to prove
     $(2/p) = (-1)^{(p^2-1)/8}$.

     (a) Let $H = \{\pm 1\} P_{\mathbb{Q},1}(8\infty)$. Show that via the
         Existence Theorem, H corresponds to $\mathbb{Q}(\sqrt{2})$. Hint:
         using the arguments of Theorem 8.12 and part (b) of Exercise 8.7,
         show that H corresponds to one of $\mathbb{Q}(i)$,
         $\mathbb{Q}(\sqrt{2})$ or $\mathbb{Q}(\sqrt{-2})$. Then use
         $-1 \in H$ to show that the corresponding field must be real.

     (b) Construct an isomorphism $(\mathbb{Z}/8\mathbb{Z})^* \to
         I_{\mathbb{Q}}(8\infty)/P_{\mathbb{Q},1}(8\infty)$, and then use
         Weak Reciprocity to show that $(2/\cdot)$ induces a well-defined
         homomorphism on $(\mathbb{Z}/8\mathbb{Z})^*$ whose kernel is
         $\{\pm 1\}$.

     (c) Show that $(2/p) = (-1)^{(p^2-1)/8}$.


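8.9. In this exercise we will use Strong Reciprocity and the properties of
     the Hilbert symbol to prove cubic reciprocity. We will assume that
     the reader is familiar with p-adic fields. To list the properties of the
     Hilbert symbol, let K be a number field containing a primitive nth root
     of unity, and let $\mathfrak{p}$ be a prime of K. The completion of K at
     $\mathfrak{p}$ will be denoted $K_{\mathfrak{p}}$. Then the Hilbert
     symbol $\left(\frac{\alpha,\beta}{\mathfrak{p}}\right)_n$ is defined for
     $\alpha, \beta \in K_{\mathfrak{p}}^*$ and gives a map

     $$\left(\frac{\cdot\,,\cdot}{\mathfrak{p}}\right)_n : K_{\mathfrak{p}}^* \times K_{\mathfrak{p}}^* \to \mu_n,$$

     where $\mu_n$ is the group of nth roots of unity. The Hilbert symbol has
     the following properties:

     (i) $\left(\frac{\alpha\alpha',\beta}{\mathfrak{p}}\right)_n = \left(\frac{\alpha,\beta}{\mathfrak{p}}\right)_n \left(\frac{\alpha',\beta}{\mathfrak{p}}\right)_n$.
     (ii) $\left(\frac{\alpha,\beta\beta'}{\mathfrak{p}}\right)_n = \left(\frac{\alpha,\beta}{\mathfrak{p}}\right)_n \left(\frac{\alpha,\beta'}{\mathfrak{p}}\right)_n$.
     (iii) $\left(\frac{\alpha,\beta}{\mathfrak{p}}\right)_n = \left(\frac{\beta,\alpha}{\mathfrak{p}}\right)_n^{-1}$.
     (iv) $\left(\frac{\alpha,-\alpha}{\mathfrak{p}}\right)_n = 1$.
     (v) $\left(\frac{\alpha,1-\alpha}{\mathfrak{p}}\right)_n = 1$.

     For proofs of these properties of the Hilbert symbol, see Neukirch
     [80, §III.5].

     Now let's specialize to the case $n = 3$ and $K = \mathbb{Q}(\omega)$,
     $\omega = e^{2\pi i/3}$. As we saw in (8.15), Strong Reciprocity shows
     that cubic reciprocity is equivalent to the assertion

     $$\pi, \theta \text{ primary in } \mathcal{O}_K \implies \left(\frac{\pi,\theta}{\lambda}\right)_3 = 1,$$

     where $\lambda = 1 - \omega$. Recall that $\pi$ primary means that
     $\pi \equiv \pm 1 \bmod 3\mathcal{O}_K$. In §4 we saw that replacing
     $\pi$ by $-\pi$ doesn't affect the statement of cubic reciprocity, so
     that we can assume that $\pi \equiv \theta \equiv 1 \bmod
     \lambda^2\mathcal{O}_K$ (note that $\lambda^2$ and 3 differ by a unit in
     $\mathcal{O}_K$). Let $K_\lambda$ be the completion of K at $\lambda$,
     and let $\mathcal{O}_\lambda$ be the valuation ring of $K_\lambda$. We
     will use the properties of the cubic Hilbert symbol to show that

     $$\alpha, \beta \equiv 1 \bmod \lambda^2\mathcal{O}_\lambda \implies \left(\frac{\alpha,\beta}{\lambda}\right)_3 = 1,$$

     and then cubic reciprocity will be proved.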


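     (a) If $\alpha \equiv 1 \bmod \lambda^4\mathcal{O}_\lambda$, then prove
         that $\alpha = u^3$ for some $u \in \mathcal{O}_\lambda^*$. Hint: if
         $\alpha \equiv u_n^3 \bmod \lambda^n\mathcal{O}_\lambda$ for
         $n \ge 4$, then show that $\alpha \equiv (u_n + a\lambda^{n-2})^3
         \bmod \lambda^{n+1}\mathcal{O}_\lambda$ for an appropriately chosen
         $a \in \mathcal{O}_\lambda$.

     (b) If $\alpha \in \mathcal{O}_\lambda^*$ and $\alpha \equiv \alpha'
         \bmod \lambda^4\mathcal{O}_\lambda$, then prove that for any
         $\beta \in K_\lambda^*$

         $$\left(\frac{\alpha,\beta}{\lambda}\right)_3 = \left(\frac{\alpha',\beta}{\lambda}\right)_3.$$

         Hint: use (a) and property (i) above. Remember that
         $\left(\frac{\alpha,\beta}{\lambda}\right)_3$ is a cube root of
         unity.

     (c) Now assume that $\alpha \equiv \beta \equiv 1 \bmod
         \lambda^2\mathcal{O}_\lambda$, and write $\alpha = 1 + a\lambda^2$,
         $a \in \mathcal{O}_\lambda$. Then first, apply property (v) to
         $1 + a\beta\lambda^2$, and second, apply (b) to
         $1 + a\beta\lambda^2 \equiv 1 + a\lambda^2 \bmod
         \lambda^4\mathcal{O}_\lambda$. This proves that

         $$\left(\frac{\alpha,\,-a\beta\lambda^2}{\lambda}\right)_3 = 1.$$

         From here, properties (ii) and (v) easily imply that
         $\left(\frac{\alpha,\beta}{\lambda}\right)_3 = 1$, which completes
         the proof of cubic reciprocity.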


8.10. In this exercise we will study the properties of the Dirichlet density.

     (a) Assuming that $\zeta_K(s) = \sum_{\mathfrak{a} \subset \mathcal{O}_K}
         N(\mathfrak{a})^{-s}$ converges absolutely for $\mathrm{Re}(s) > 1$,
         show that for $S \subset P_K$, the sum $\sum_{\mathfrak{p} \in S}
         N(\mathfrak{p})^{-s}$ also converges absolutely for
         $\mathrm{Re}(s) > 1$.

     (b) Use (a) to prove that properties (ii)-(iv) of the Dirichlet density
         follow from (i) and the definition.


8.11. Apply the Cebotarev Density Theorem to the cyclotomic extension
     $\mathbb{Q} \subset \mathbb{Q}(\zeta_m)$ to show that the primes in a
     fixed congruence class in $(\mathbb{Z}/m\mathbb{Z})^*$ have Dirichlet
     density $1/\phi(m)$. This proves Dirichlet's theorem that there are
     infinitely many primes in an arithmetic progression where the first term
     and common difference are relatively prime.


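8.12. Let M be a finite extension of a number field K, and let
     $\tilde{\mathcal{S}}_{M/K}$ be the set of primes of $\mathcal{O}_K$
     defined in Proposition 8.20.

     (a) If M is Galois over K, then show that $\tilde{\mathcal{S}}_{M/K}$
         equals the set $\mathcal{S}_{M/K}$ of Theorem 8.19.

     (b) If L is a Galois extension of K and $L \subset M$, then show that
         $\tilde{\mathcal{S}}_{M/K} \subset \mathcal{S}_{L/K}$.

     (c) If $L \subset M$ are finite extensions of K, then show that
         $\mathcal{S}_{M/K} \subset \mathcal{S}_{L/K}$.


8.13. Let $K \subset N$ be a Galois extension, and let $\mathfrak{P}$ be a
     prime of $\mathcal{O}_N$. Set $\mathfrak{p} = \mathfrak{P} \cap
     \mathcal{O}_K$, $e = e_{\mathfrak{P}|\mathfrak{p}}$, and
     $f = f_{\mathfrak{P}|\mathfrak{p}}$. If $D_{\mathfrak{P}} \subset
     \mathrm{Gal}(N/K)$ is the decomposition group of $\mathfrak{P}$, we will
     denote the fixed field of $D_{\mathfrak{P}}$ by $N_{\mathfrak{P}}$. From
     Proposition 5.10, we know that $|D_{\mathfrak{P}}| = ef$, and Galois
     theory tells us that $[N : N_{\mathfrak{P}}] = |D_{\mathfrak{P}}|$. Let
     $\mathfrak{P}' = \mathfrak{P} \cap \mathcal{O}_{N_{\mathfrak{P}}}$.

     (a) Prove that $e_{\mathfrak{P}'|\mathfrak{p}} =
         f_{\mathfrak{P}'|\mathfrak{p}} = 1$. Hint: by Proposition 5.10, the
         map $D_{\mathfrak{P}} \to G$ is surjective, where G is the Galois
         group of $\mathcal{O}_K/\mathfrak{p} \subset
         \mathcal{O}_N/\mathfrak{P}$. Use $\mathcal{O}_K/\mathfrak{p} \subset
         \mathcal{O}_{N_{\mathfrak{P}}}/\mathfrak{P}' \subset
         \mathcal{O}_N/\mathfrak{P}$, and remember that the e's and f's are
         multiplicative (see Exercise 5.15).

     (b) Given an intermediate field $K \subset M \subset N$, let
         $\mathfrak{P}_M = \mathfrak{P} \cap \mathcal{O}_M$. Then prove that

         $$e_{\mathfrak{P}_M|\mathfrak{p}} = f_{\mathfrak{P}_M|\mathfrak{p}} = 1 \iff M \subset N_{\mathfrak{P}}.$$

         Hint: if $M \subset N_{\mathfrak{P}}$, then apply (a). Conversely,
         show that the compositum $N_{\mathfrak{P}}M$ is the fixed field for
         the decomposition group of $\mathfrak{P}$ in $\mathrm{Gal}(N/M)$. By
         applying the result of (a) to $M \subset N_{\mathfrak{P}}M$ and
         computing degrees, one sees that $N_{\mathfrak{P}}M = M$, which
         implies $M \subset N_{\mathfrak{P}}$.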


8.14. Let L and M be finite extensions of a number field K, and let
     $\mathfrak{p}$ be a prime of K that splits completely in L and M. Then
     prove that $\mathfrak{p}$ splits completely in LM. Hint: let N be a
     Galois extension of K containing both L and M, and let $\mathfrak{P}$ be
     a prime of N containing $\mathfrak{p}$. From Exercise 8.13 we get the
     intermediate field $K \subset N_{\mathfrak{P}} \subset N$. Then use part
     (b) of that exercise to show that L and M lie in $N_{\mathfrak{P}}$,
     which implies $LM \subset N_{\mathfrak{P}}$.


8.15. Let L be a finite extension of a number field K, and let $L'$ be the
     Galois closure of L over K. The goal of this exercise is to prove that
     $\mathcal{S}_{L/K} = \mathcal{S}_{L'/K}$. By part (c) of Exercise 8.12,
     we have $\mathcal{S}_{L'/K} \subset \mathcal{S}_{L/K}$, so that it
     suffices to show that a prime of K that splits completely in L also
     splits completely in $L'$.

     (a) Let $\sigma : L \to \mathbb{C}$ be an embedding which is the
         identity on K, and let $\mathfrak{p}$ be a prime ideal of K which
         splits completely in L. Then prove that $\mathfrak{p}$ splits
         completely in $\sigma(L)$.

     (b) Since $L'$ is the compositum of the $\sigma(L)$'s, use the previous
         exercise to show that $\mathfrak{p}$ splits completely in $L'$.


8.16. Let $K \subset M$ be a finite extension of number fields. Then prove
     that $K \subset M$ is a Galois extension if and only if
     $\tilde{\mathcal{S}}_{M/K} \doteq \mathcal{S}_{M/K}$. Hint: one
     implication is covered in part (a) of Exercise 8.12, and the other
     implication is an easy consequence of Proposition 8.20.


§9. RING CLASS FIELDS AND $p = x^2 + ny^2$


In Theorem 5.1 we used the Hilbert class field to characterize
$p = x^2 + ny^2$ when n is positive, squarefree, and $n \not\equiv 3 \bmod 4$.
In §4, we also proved that for an odd prime p,

$$p = x^2 + 27y^2 \iff p \equiv 1 \bmod 3 \text{ and } x^3 \equiv 2 \bmod p \text{ has an integer solution}$$
$$p = x^2 + 64y^2 \iff p \equiv 1 \bmod 4 \text{ and } x^4 \equiv 2 \bmod p \text{ has an integer solution.}$$

These earlier results follow the format of Theorem 5.1 (note that both
exponents are class numbers), yet neither is a corollary of the theorem, for
27 and 64 are not squarefree. In §9 we will use the theory developed in
§§7 and 8 to overcome this limitation. Specifically, given an order
$\mathcal{O}$ in an imaginary quadratic field K, we will construct a
generalization of the Hilbert class field called the ring class field of
$\mathcal{O}$. Then, using the ring class field of the order
$\mathbb{Z}[\sqrt{-n}]$, where $n > 0$ is now arbitrary, we will prove a
version of Theorem 5.1 that holds for all n (see Theorem 9.2 below). This,
of course, is the main theorem of the whole book. The basic idea is that the
criterion for $p = x^2 + ny^2$ is determined by a primitive element of the
ring class field of $\mathbb{Z}[\sqrt{-n}]$. To see how this works in
practice, we will describe the ring class fields of $\mathbb{Z}[\sqrt{-27}]$
and $\mathbb{Z}[\sqrt{-64}]$, and then Theorem 9.2 will give us class field
theory proofs of Euler's conjectures for $p = x^2 + 27y^2$ or
$x^2 + 64y^2$. To complete the circle of ideas, we will then explain how class
field theory implies those portions of cubic and biquadratic reciprocity used
in §4 in our earlier discussion of $x^2 + 27y^2$ and $x^2 + 64y^2$.

The remainder of the section will explore two other aspects of ring class 
fields. We will first use the Cebotarev Density Theorem to prove that a 
primitive positive definite quadratic form represents infinitely many prime 
numbers. Then, in a different direction, we will give a purely field-theoretic 
characterization of ring class fields and their subfields. 


A. Solution of $p = x^2 + ny^2$ for all n


Before introducing ring class fields, we need some notation. If K is a
number field, an ideal $\mathfrak{m}$ of $\mathcal{O}_K$ can be regarded as a
modulus, and in §8 we defined the ideal groups $I_K(\mathfrak{m})$ and
$P_{K,1}(\mathfrak{m})$. In this section, $\mathfrak{m}$ will usually be a
principal ideal $\alpha\mathcal{O}_K$, and the above groups will be written
$I_K(\alpha)$ and $P_{K,1}(\alpha)$.

To define a ring class field, let $\mathcal{O}$ be an order of conductor f in
an imaginary quadratic field K. We know from Proposition 7.22 that the ideal
class group $C(\mathcal{O})$ can be written

(9.1) $C(\mathcal{O}) \simeq I_K(f)/P_{K,\mathbb{Z}}(f)$

(recall that $P_{K,\mathbb{Z}}(f)$ is generated by the principal ideals
$\alpha\mathcal{O}_K$, where $\alpha \equiv a \bmod f\mathcal{O}_K$ for some
integer a with $\gcd(a, f) = 1$). Furthermore, in §8 we saw that

$$P_{K,1}(f) \subset P_{K,\mathbb{Z}}(f) \subset I_K(f),$$

so that $C(\mathcal{O})$ is a generalized ideal class group of K for the
modulus $f\mathcal{O}_K$ (see (8.1)). By the Existence Theorem (Theorem 8.6),
this data determines a unique Abelian extension L of K, which is called the
ring class field of the order $\mathcal{O}$. The basic properties of the ring
class field L are, first, all primes of K ramified in L must divide
$f\mathcal{O}_K$, and second, the Artin map and (9.1) give us isomorphisms

$$C(\mathcal{O}) \simeq I_K(f)/P_{K,\mathbb{Z}}(f) \simeq \mathrm{Gal}(L/K).$$

In particular the degree of L over K is the class number, i.e.,
$[L:K] = h(\mathcal{O})$. For an example of a ring class field, note that the
ring class field of the maximal order $\mathcal{O}_K$ is the Hilbert class
field of K (see Exercise 9.1). Later in this section we will give other
examples of ring class fields.
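
Since the degree $[L:K]$ equals a class number, it can be computed by hand or by machine. The sketch below is an illustration only: it counts primitive reduced positive definite forms $ax^2 + bxy + cy^2$ of a given negative discriminant, using the standard reduction conditions $|b| \le a \le c$, with $b \ge 0$ when $|b| = a$ or $a = c$. The function names are hypothetical, chosen for this example.

```python
from math import gcd


def reduced_forms(D):
    """Primitive reduced forms (a, b, c) with b^2 - 4ac = D, for D < 0 and D = 0 or 1 mod 4."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    # A reduced form satisfies 3a^2 <= -D, which bounds the search.
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):        # enforces -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue                       # need a <= c
            if a == c and b < 0:
                continue                       # need b >= 0 when a = c
            if gcd(gcd(a, abs(b)), c) == 1:    # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms


def class_number(D):
    """h(D), computed as the number of primitive reduced forms of discriminant D."""
    return len(reduced_forms(D))


if __name__ == "__main__":
    for n in (14, 27, 64):
        print(n, class_number(-4 * n), reduced_forms(-4 * n))
```

For $n = 27$ and $n = 64$ the counts are 3 and 4, which are the degrees of the ring class fields of $\mathbb{Z}[\sqrt{-27}]$ and $\mathbb{Z}[\sqrt{-64}]$ over K.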

We can now state the main theorem of the book: 


Theorem 9.2. Let $n > 0$ be an integer. Then there is a monic irreducible
polynomial $f_n(x) \in \mathbb{Z}[x]$ of degree $h(-4n)$ such that if an odd
prime p divides neither n nor the discriminant of $f_n(x)$, then

$$p = x^2 + ny^2 \iff \begin{cases} (-n/p) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \\ \text{has an integer solution.} \end{cases}$$

Furthermore, $f_n(x)$ may be taken to be the minimal polynomial of a real
algebraic integer $\alpha$ for which $L = K(\alpha)$ is the ring class field
of the order $\mathbb{Z}[\sqrt{-n}]$ in the imaginary quadratic field
$K = \mathbb{Q}(\sqrt{-n})$.

Finally, if $f_n(x)$ is any monic integer polynomial of degree $h(-4n)$ for
which the above equivalence holds, then $f_n(x)$ is irreducible over
$\mathbb{Z}$ and is the minimal polynomial of a primitive element of the ring
class field L described above.


Remark. This theorem generalizes Theorem 5.1, and the last part of the
theorem shows that knowing $f_n(x)$ is equivalent to knowing the ring class
field of $\mathbb{Z}[\sqrt{-n}]$.


Proof. Before proceeding with the proof, we will first prove the following 
general fact about ring class fields: 


Lemma 9.3. Let L be the ring class field of an order $\mathcal{O}$ in an
imaginary quadratic field K. Then L is a Galois extension of $\mathbb{Q}$, and
its Galois group can be written as a semidirect product

$$\mathrm{Gal}(L/\mathbb{Q}) \simeq \mathrm{Gal}(L/K) \rtimes (\mathbb{Z}/2\mathbb{Z})$$

where the nontrivial element of $\mathbb{Z}/2\mathbb{Z}$ acts on
$\mathrm{Gal}(L/K)$ by sending $\sigma$ to its inverse $\sigma^{-1}$.


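Proof. In the case of the Hilbert class field, this lemma was proved in §6
(see the discussion following (6.3)). To do the general case, we first need to
show that $\tau(L) = L$, where $\tau$ denotes complex conjugation. Let
$\mathfrak{m}$ denote the modulus $f\mathcal{O}_K$, and note that
$\tau(\mathfrak{m}) = \mathfrak{m}$. Since $\ker(\Phi_{L/K,\mathfrak{m}}) =
P_{K,\mathbb{Z}}(f)$, an easy computation shows that

$$\ker(\Phi_{\tau(L)/K,\mathfrak{m}}) = \tau(\ker(\Phi_{L/K,\mathfrak{m}})) = \tau(P_{K,\mathbb{Z}}(f)) = P_{K,\mathbb{Z}}(f)$$

(see Exercise 9.2), and thus $\ker(\Phi_{\tau(L)/K,\mathfrak{m}}) =
\ker(\Phi_{L/K,\mathfrak{m}})$. Then $\tau(L) = L$ follows from Corollary 8.7.

As we noticed in the proof of Lemma 5.28, this implies that L is Galois
over $\mathbb{Q}$, so that we have an exact sequence

$$1 \to \mathrm{Gal}(L/K) \to \mathrm{Gal}(L/\mathbb{Q}) \to \mathrm{Gal}(K/\mathbb{Q})\ (\simeq \mathbb{Z}/2\mathbb{Z}) \to 1.$$

Since $\tau \in \mathrm{Gal}(L/\mathbb{Q})$, $\mathrm{Gal}(L/\mathbb{Q})$ is
the semidirect product $\mathrm{Gal}(L/K) \rtimes (\mathbb{Z}/2\mathbb{Z})$,
where the nontrivial element of $\mathbb{Z}/2\mathbb{Z}$ acts by conjugation by
$\tau$. However, for a prime $\mathfrak{p}$ of K, Lemma 5.19 implies that

$$\tau\,((L/K)/\mathfrak{p})\,\tau^{-1} = ((L/K)/\tau(\mathfrak{p})) = ((L/K)/\bar{\mathfrak{p}})$$

(see Exercise 6.3). Thus, under the isomorphism $I_K(f)/P_{K,\mathbb{Z}}(f)
\simeq \mathrm{Gal}(L/K)$, conjugation by $\tau$ in $\mathrm{Gal}(L/K)$
corresponds to the usual action of $\tau$ on $I_K(f)$. But if $\mathfrak{a}$
is any ideal in $I_K(f)$, then $\mathfrak{a}\bar{\mathfrak{a}} =
N(\mathfrak{a})\mathcal{O}_K$ lies in $P_{K,\mathbb{Z}}(f)$ since
$N(\mathfrak{a})$ is prime to f. Thus $\bar{\mathfrak{a}}$ gives the inverse of
$\mathfrak{a}$ in the quotient $I_K(f)/P_{K,\mathbb{Z}}(f)$, and the lemma is
proved. Q.E.D.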


We can now proceed with the proof of Theorem 9.2. Let L be the ring
class field of $\mathbb{Z}[\sqrt{-n}]$. We start by relating
$p = x^2 + ny^2$ to the behavior of p in L:


Theorem 9.4. Let $n > 0$ be an integer, and L be the ring class field of the
order $\mathbb{Z}[\sqrt{-n}]$ in the imaginary quadratic field
$K = \mathbb{Q}(\sqrt{-n})$. If p is an odd prime not dividing n, then

$$p = x^2 + ny^2 \iff p \text{ splits completely in } L.$$


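Proof. Let $\mathcal{O} = \mathbb{Z}[\sqrt{-n}]$. The discriminant of
$\mathcal{O}$ is $-4n$, and then $-4n = f^2 d_K$ by (7.3), where f is the
conductor of $\mathcal{O}$. Let p be an odd prime not dividing n. Then
$p \nmid f^2 d_K$, which implies that p is unramified in K. We will prove the
following equivalences:

$$\begin{aligned}
p = x^2 + ny^2 &\iff p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}}, \text{ and } \mathfrak{p} = \alpha\mathcal{O}_K,\ \alpha \in \mathcal{O} \\
&\iff p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}}, \text{ and } \mathfrak{p} \in P_{K,\mathbb{Z}}(f) \\
&\iff p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}}, \text{ and } ((L/K)/\mathfrak{p}) = 1 \\
&\iff p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}}, \text{ and } \mathfrak{p} \text{ splits completely in } L \\
&\iff p \text{ splits completely in } L,
\end{aligned}$$

and Theorem 9.4 will follow.

To prove the first equivalence, suppose that $p = x^2 + ny^2 =
(x + \sqrt{-n}\,y)(x - \sqrt{-n}\,y)$. If we set $\mathfrak{p} =
(x + \sqrt{-n}\,y)\mathcal{O}_K$, then $p\mathcal{O}_K =
\mathfrak{p}\bar{\mathfrak{p}}$ is the prime factorization of
$p\mathcal{O}_K$ in $\mathcal{O}_K$. Note that $x + \sqrt{-n}\,y \in
\mathcal{O}$, and $\mathfrak{p} \ne \bar{\mathfrak{p}}$ since p is unramified
in K. Conversely, if $p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}}$, where
$\mathfrak{p} = (x + \sqrt{-n}\,y)\mathcal{O}_K$, then it follows easily that
$p = x^2 + ny^2$.

Since $p \nmid f$, the second equivalence follows from Proposition 7.22, and
the next two equivalences are equally straightforward: the isomorphism
$I_K(f)/P_{K,\mathbb{Z}}(f) \simeq \mathrm{Gal}(L/K)$ given by the Artin map
shows that $\mathfrak{p} \in P_{K,\mathbb{Z}}(f)$ if and only if
$((L/K)/\mathfrak{p}) = 1$, and then Corollary 5.21 shows that
$((L/K)/\mathfrak{p}) = 1$ if and only if $\mathfrak{p}$ splits completely in
L. Finally, recall from Lemma 9.3 that L is Galois over $\mathbb{Q}$. Thus,
the proof of the last equivalence is identical to the proof of the last
equivalence of (5.27). This completes the proof of Theorem 9.4. Q.E.D.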


The next step is to prove the main equivalence of Theorem 9.2. By
Lemma 9.3, the ring class field L is Galois over $\mathbb{Q}$, and thus
Proposition 5.29 enables us to find a real algebraic integer $\alpha$ such
that $L = K(\alpha)$. Let $f_n(x) \in \mathbb{Z}[x]$ be the minimal polynomial
of $\alpha$ over K. Since $\mathcal{O}$ has discriminant $-4n$, the degree of
$f_n(x)$ is $[L:K] = h(\mathcal{O}) = h(-4n)$. Then, combining Theorem 9.4
with the last part of Proposition 5.29, we have

$$\begin{aligned}
p = x^2 + ny^2 &\iff p \text{ splits completely in } L \\
&\iff (-n/p) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \text{ has an integer solution,}
\end{aligned}$$

whenever p is an odd prime dividing neither n nor the discriminant of
$f_n(x)$. This proves the main equivalence of Theorem 9.2.

The final part of the theorem is concerned with the “uniqueness” of
$f_n(x)$. Of course, there are infinitely many real algebraic integers which
are primitive elements of the extension $K \subset L$, and correspondingly
there are infinitely many $f_n(x)$'s. So the best we could hope for in the way
of uniqueness is that these are all of the possible $f_n(x)$'s. This is almost
what the last part of the statement of Theorem 9.2 asserts: the $f_n(x)$'s
that can occur are exactly the monic integer polynomials which are minimal
polynomials of primitive elements (not necessarily real) of L over K.

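To prove this assertion, let $f_n(x)$ be a monic integer polynomial of
degree $h(-4n)$ which satisfies the equivalence of Theorem 9.2. Then let
$g(x) \in K[x]$ be an irreducible factor of $f_n(x)$ over K, and let
$M = K(\alpha)$ be the field generated by a root $\alpha$ of $g(x)$. Note that
$\alpha$ is an algebraic integer. If we can show that $L \subset M$, then

$$h(-4n) = [L:K] \le [M:K] = \deg(g(x)) \le \deg(f_n(x)) = h(-4n),$$

which will prove that $L = M = K(\alpha)$ and that $f_n(x)$ is the minimal
polynomial of $\alpha$ over K (and hence over $\mathbb{Q}$).

It remains to prove $L \subset M$. Since L is Galois over $\mathbb{Q}$ by
Lemma 9.3, Proposition 8.20 tells us that $L \subset M$ if and only if
$\tilde{\mathcal{S}}_{M/\mathbb{Q}} \,\dot\subset\,
\mathcal{S}_{L/\mathbb{Q}}$, where:

$$\begin{aligned}
\mathcal{S}_{L/\mathbb{Q}} &= \{p \text{ prime} : p \text{ splits completely in } L\} \\
\tilde{\mathcal{S}}_{M/\mathbb{Q}} &= \{p \text{ prime} : \text{there is a prime } \mathfrak{P} \text{ of } M \text{ with } f_{\mathfrak{P}|p} = 1\}.
\end{aligned}$$

Let's first study $\mathcal{S}_{L/\mathbb{Q}}$. By Theorem 9.4, this is the
set of primes p represented by $x^2 + ny^2$. Since $f_n(x)$ satisfies the
equivalence of Theorem 9.2, it follows that $\mathcal{S}_{L/\mathbb{Q}}$ is
(with finitely many exceptions) the set of primes p which split completely in
K and for which $f_n(x) \equiv 0 \bmod p$ has a solution.

To prove $\tilde{\mathcal{S}}_{M/\mathbb{Q}} \,\dot\subset\,
\mathcal{S}_{L/\mathbb{Q}}$, suppose that $p \in
\tilde{\mathcal{S}}_{M/\mathbb{Q}}$. Then $f_{\mathfrak{P}|p} = 1$ for some
prime $\mathfrak{P}$ of M, and if we set $\mathfrak{p} = \mathfrak{P} \cap
\mathcal{O}_K$, then $1 = f_{\mathfrak{P}|p} =
f_{\mathfrak{P}|\mathfrak{p}} f_{\mathfrak{p}|p}$. Thus
$f_{\mathfrak{p}|p} = 1$, which implies that p splits completely in K (since
it's unramified). Note also that $f_n(x) \equiv 0 \bmod \mathfrak{P}$ has a
solution in $\mathcal{O}_M$ since $\alpha \in \mathcal{O}_M$ and
$g(\alpha) = f_n(\alpha) = 0$. But $f_{\mathfrak{P}|p} = 1$ implies that
$\mathbb{Z}/p\mathbb{Z} \simeq \mathcal{O}_M/\mathfrak{P}$, and hence
$f_n(x) \equiv 0 \bmod p$ has an integer solution. By the above description of
$\mathcal{S}_{L/\mathbb{Q}}$, it follows that $p \in
\mathcal{S}_{L/\mathbb{Q}}$. This proves
$\tilde{\mathcal{S}}_{M/\mathbb{Q}} \,\dot\subset\,
\mathcal{S}_{L/\mathbb{Q}}$ and completes the proof of Theorem 9.2. Q.E.D.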


There are also versions of Theorems 9.2 and 9.4 that characterize which
primes are represented by the form $x^2 + xy + ((1-D)/4)y^2$, where
$D \equiv 1 \bmod 4$ is negative (see Exercise 9.3).


B. The Ring Class Fields of $\mathbb{Z}[\sqrt{-27}]$ and $\mathbb{Z}[\sqrt{-64}]$


Theorem 9.2 shows how the ring class field solves our basic problem of
determining when $p = x^2 + ny^2$, and the last part of the theorem points
out that our problem is in fact equivalent to finding the appropriate ring
class field. To see how this works in practice, we will next use Theorem 9.2
to give new proofs of Euler's conjectures for when a prime is represented
by $x^2 + 27y^2$ or $x^2 + 64y^2$ (proved in §4 as Theorems 4.15 and 4.23).
The first step, of course, is to determine the ring class fields involved:


Proposition 9.5.

(i) The ring class field of the order $\mathbb{Z}[\sqrt{-27}] \subset K =
    \mathbb{Q}(\sqrt{-3})$ is $L = K(\sqrt[3]{2})$.

(ii) The ring class field of the order $\mathbb{Z}[\sqrt{-64}] \subset K =
    \mathbb{Q}(i)$ is $L = K(\sqrt[4]{2})$.


Proof. To prove (i), let L be the ring class field of
$\mathbb{Z}[\sqrt{-27}]$. Although L is defined abstractly by class field
theory, we still know the following facts about L:

(i) L is a cubic Galois extension of $K = \mathbb{Q}(\sqrt{-3})$ since
    $[L:K] = h(-4 \cdot 27) = 3$.

(ii) L is Galois over $\mathbb{Q}$ with group $\mathrm{Gal}(L/\mathbb{Q})$
    isomorphic to the symmetric group $S_3$. This follows from Lemma 9.3
    since $S_3$ is the semidirect product $(\mathbb{Z}/3\mathbb{Z}) \rtimes
    (\mathbb{Z}/2\mathbb{Z})$ with $\mathbb{Z}/2\mathbb{Z}$ acting
    nontrivially.

(iii) All primes of K that ramify in L must divide $6\mathcal{O}_K$. To see
    this, note that $\mathbb{Z}[\sqrt{-27}] = \mathbb{Z}[3\sqrt{-3}]$ is an
    order of conductor 6 (since $\mathcal{O}_K =
    \mathbb{Z}[(-1 + \sqrt{-3})/2]$), so that L corresponds to a generalized
    ideal class group for the modulus $6\mathcal{O}_K$. By the Existence
    Theorem (Theorem 8.6), the ramification must divide the modulus.

We will show that only four fields satisfy these conditions. To see this,
first note that K contains a primitive cube root of unity, and hence any cubic
Galois extension of K is of the form $K(\sqrt[3]{u})$ for some $u \in K$.
(This is a standard result of Galois theory; see Artin [2, Corollary to
Theorem 25].) However, the fact that $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_3$
allows us to assume that u is an ordinary integer. More precisely, we have:


Lemma 9.6. If M is a cubic extension of $K = \mathbb{Q}(\sqrt{-3})$ with
$\mathrm{Gal}(M/\mathbb{Q}) \simeq S_3$, then $M = K(\sqrt[3]{m})$ for some
cubefree positive integer m.


Proof. The idea is to modify the classical proof that $M = K(\sqrt[3]{u})$
for some $u \in K$. We know that M is Galois over $\mathbb{Q}$ and that
complex conjugation $\tau$ is in $\mathrm{Gal}(M/\mathbb{Q})$. Furthermore, if
$\sigma$ is a generator of $\mathrm{Gal}(M/K) \simeq
\mathbb{Z}/3\mathbb{Z}$, then $\mathrm{Gal}(M/\mathbb{Q}) \simeq S_3$ implies
that $\tau\sigma\tau^{-1} = \sigma^{-1}$.

By Proposition 5.29, we can find a real algebraic integer $\alpha$ such that
$M = K(\alpha)$. Then define $u_i \in M$ by

$$u_i = \alpha + \omega^i \sigma^{-1}(\alpha) + \omega^{2i} \sigma^{-2}(\alpha), \quad i = 0, 1, 2.$$

The $u_i$'s are algebraic integers satisfying $\sigma(u_i) = \omega^i u_i$,
and note that $\tau(u_i) = u_i$ since $\alpha$ is real and
$\tau\sigma\tau^{-1} = \sigma^{-1}$. Thus the $u_i$'s are all real. Then
$u_0$ is fixed by both $\sigma$ and $\tau$, which implies that
$u_0 \in \mathbb{Z}$. Similar arguments show that $u_1^3$ and $u_2^3$ are also
integers. If $u_1 \ne 0$, we claim that $M = K(u_1)$. This is easy to see, for
$[M:K] = 3$, and thus $M \ne K(u_1)$ could only happen when $u_1 \in K$. Since
$u_1$ is real, this would force $u_1$ to be an integer, which would contradict
$\sigma(u_1) = \omega u_1$ and $u_1 \ne 0$. This proves our claim, and if we
set $m = u_1^3 \in \mathbb{Z}$, it follows that $M = K(u_1) =
K(\sqrt[3]{m})$. We may assume that m is positive and cubefree, and we are
done.

If $u_2 \ne 0$, we are done by a similar argument. The remaining case to
consider is when $u_1 = u_2 = 0$. However, in this situation a simple
application of Cramer's rule shows that our original $\alpha$ would lie in K
and hence be rational (since we chose $\alpha$ to be real in the first place).
The details of this argument are left to the reader (see Exercise 9.4), and
this completes the proof of Lemma 9.6. Q.E.D.


Once we know $L = K(\sqrt[3]{m})$ for some cubefree integer m, the next step
is to use the ramification of $K \subset L$ to restrict m. Specifically, it is
easy to show that any prime of $\mathcal{O}_K$ dividing m ramifies in
$K(\sqrt[3]{m})$ (see Exercise 9.5). However, by (iii) above, we know that
all ramified primes divide $6\mathcal{O}_K$, and consequently 2 and 3 are the
only integer primes that can divide m. Since m is also positive and cubefree,
it must be one of the following eight numbers:

    2, 3, 4, 6, 9, 12, 18, 36,

and this in turn implies that L must be one of the following four fields:

(9.7) $K(\sqrt[3]{2}),\ K(\sqrt[3]{3}),\ K(\sqrt[3]{6}),\ K(\sqrt[3]{12})$

(see Exercise 9.6). All four of these fields satisfy conditions (i)-(iii)
above, so that we will need something else to decide which one is the ring
class field L.
Surprisingly, the extra ingredient is none other than Theorem 9.2. More
precisely, each field listed in (9.7) gives a different candidate for the polynomial
f_27(x) that characterizes p = x² + 27y², and then numerical computations
can determine which one is the correct field. To illustrate what
this means, suppose that L were K(∛3), the second field in (9.7). This
would imply that f_27(x) = x³ − 3, which has discriminant −3⁵ (see Exercise
9.7). If Theorem 9.2 held with this particular f_27(x), then the congruence
x³ ≡ 3 mod 31 would have a solution since 31 = 2² + 27·1² is of the form
x² + 27y². Using a computer, it is straightforward to show that there are
no solutions, so that K(∛3) can't be the ring class field in question. Similar
arguments (also using p = 31) suffice to rule out the third and fourth fields
given in (9.7) (see Exercise 9.8), and it follows that L = K(∛2) as claimed.
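
The computer check mentioned above is a brute-force search over residues modulo 31. The sketch below is one way to carry it out; the function name `cube_roots` is an illustrative choice, not something taken from the text.

```python
def cube_roots(a, p):
    """All x in 0..p-1 with x^3 congruent to a modulo p (brute force)."""
    return [x for x in range(p) if pow(x, 3, p) == a % p]

# One candidate m for each of the four fields in (9.7), tested at p = 31 = 2^2 + 27*1^2.
for m in (2, 3, 6, 12):
    print(m, cube_roots(m, 31))
# 2 [4, 7, 20]
# 3 []
# 6 []
# 12 []
```

Only m = 2 gives a solvable congruence, which is consistent with L = K(∛2).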

The second part of the proposition, which concerns the ring class field
of the order Z[√−64] ⊂ K = Q(i), is easier to prove than the first, for in
this case one can show that K(∜2) is the unique field satisfying the analogs
of conditions (i)–(iii) above (see Exercise 9.9). Q.E.D.

Another example of a ring class field is given in Exercise 9.10, where we
will show that the field K(∛3) from (9.7) is the ring class field of the order
Z[9ω] of conductor 9 in K = Q(√−3).

If we combine Theorem 9.2 with the explicit ring class fields of Proposition
9.5, then we get the following characterizations of when p = x² + 27y²
and p = x² + 64y² (proved earlier as Theorems 4.15 and 4.23):


Theorem 9.8.
(i) If p > 3 is prime, then

\[ p = x^2 + 27y^2 \iff p \equiv 1 \bmod 3 \text{ and } x^3 \equiv 2 \bmod p \text{ has an integer solution.} \]

(ii) If p is an odd prime, then

\[ p = x^2 + 64y^2 \iff p \equiv 1 \bmod 4 \text{ and } x^4 \equiv 2 \bmod p \text{ has an integer solution.} \]
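
Both equivalences in Theorem 9.8 are easy to test numerically for small primes. The following is a minimal sketch using brute force throughout; the helper names (`is_represented`, `has_root`, `counterexamples`) and the bound 1000 are arbitrary choices made for illustration.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def is_represented(p, n):
    """True if p = x^2 + n*y^2 for some integers x, y >= 0."""
    y = 0
    while n * y * y <= p:
        r = p - n * y * y
        if isqrt(r) ** 2 == r:
            return True
        y += 1
    return False

def has_root(k, a, p):
    """True if x^k = a (mod p) has a solution."""
    return any(pow(x, k, p) == a % p for x in range(p))

def counterexamples(bound):
    """Primes below bound where either equivalence of Theorem 9.8 fails."""
    bad = []
    for p in range(3, bound):
        if not is_prime(p):
            continue
        if p > 3 and is_represented(p, 27) != (p % 3 == 1 and has_root(3, 2, p)):
            bad.append((p, 27))
        if is_represented(p, 64) != (p % 4 == 1 and has_root(4, 2, p)):
            bad.append((p, 64))
    return bad

print(counterexamples(1000))  # []
```

An empty list is what the theorem predicts; the check is of course no substitute for the proof that follows.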


Proof. By Proposition 9.5, the ring class field of Z[√−27] is L = K(∛2),
where K = Q(√−3). Since ∛2 is a real algebraic integer, the polynomial
f_27(x) of Theorem 9.2 may be taken to be x³ − 2. Then the main equivalence
of Theorem 9.2 is exactly what we need, once one checks that the
condition (−27/p) = 1 is equivalent to the congruence p ≡ 1 mod 3. The
final detail to check is that the discriminant of x³ − 2 is −2² · 3³ (see Exercise
9.7), so that the only excluded primes are 2 and 3, and then (i) follows.
The proof of (ii) is similar and is left to the reader (see Exercise 9.11).
Q.E.D.


Besides allowing us to prove Theorem 9.8, the ring class fields deter- 
mined in Proposition 9.5 have other uses. For example, if we combine them 
with Weak Reciprocity from §8, we then get the following partial results 
concerning cubic and biquadratic reciprocity: 


Theorem 9.9.
(i) If a primary prime π of Z[ω], ω = e^{2πi/3}, is relatively prime to 6, then

\[ \left(\frac{2}{\pi}\right)_3 = \left(\frac{\pi}{2}\right)_3. \]

(ii) If p ≡ 1 mod 4 is prime and p = a² + b², then π = a + bi is prime in
Z[i], and

\[ \left(\frac{2}{\pi}\right)_4 = i^{ab/2}. \]

Remark. Notice that these are exactly the portions of cubic and biquadratic
reciprocity used in our discussion of p = x² + 27y² and x² + 64y² in §4 (see
Theorems 4.15 and 4.23).


Proof. We will prove (i) and leave the proof of (ii) as an exercise (see 
Exercise 9.12). The basic idea is to combine Weak Reciprocity (Theorem 
8.11) with the explicit description of the ring class field given in Proposi- 
tion 9.5. 

If K = Q(ω), then O_K is the ring Z[ω] from §4. Thus L = K(∛2) is the
ring class field of the order of conductor 6, and hence corresponds to a subgroup
of I_K(6) containing P_{K,Z}(6). This shows that the conductor 𝔣 divides
6O_K. Then Weak Reciprocity tells us that the cubic Legendre symbol (2/·)_3
induces a well-defined homomorphism

\[ I_K(6)/P_{K,\mathbb{Z}}(6) \longrightarrow \mu_3, \]

where μ_3 is the group of cube roots of unity. However, the map sending
α ∈ O_K to the principal ideal αO_K induces a homomorphism

\[ (\mathcal{O}_K/6\mathcal{O}_K)^* \longrightarrow I_K(6)/P_{K,\mathbb{Z}}(6) \]

(this is similar to what we did in §7; see part (c) of Exercise 9.21). Combining
these two maps, the Legendre symbol (2/·)_3 induces a well-defined
homomorphism

\[ (\mathcal{O}_K/6\mathcal{O}_K)^* \longrightarrow \mu_3. \tag{9.10} \]


Recall that π is primary by assumption, which means that π ≡ ±1 mod
3O_K. Replacing π by −π affects neither (2/π)_3 nor (π/2)_3, so that we can
assume π ≡ 1 mod 3O_K. Now consider the isomorphism

\[ (\mathcal{O}_K/6\mathcal{O}_K)^* \simeq (\mathcal{O}_K/2\mathcal{O}_K)^* \times (\mathcal{O}_K/3\mathcal{O}_K)^*. \tag{9.11} \]

By (9.10), (2/·)_3 is a homomorphism on (O_K/6O_K)*, and the condition π ≡
1 mod 3O_K means we are restricting this homomorphism to the subgroup
(O_K/2O_K)* × {1} relative to (9.11). But the cubic Legendre symbol (·/2)_3
can also be regarded as a homomorphism on this subgroup, and we thus
need only show that these homomorphisms are equal.

To prove this, first note that (O_K/2O_K)* × {1} is cyclic of order 3 (O_K/2O_K
is a field with four elements), and the class of θ = 1 + 3ω in (O_K/6O_K)*
is a generator. Thus, to show that the two homomorphisms are equal, it suffices
to prove that

\[ \left(\frac{2}{\theta}\right)_3 = \left(\frac{\theta}{2}\right)_3. \]

Using (4.10), this is straightforward to check; see Exercise 9.12 for the details.
Theorem 9.9 is proved. Q.E.D.


C. Primes Represented by Positive Definite Quadratic Forms


As an application of ring class fields, we will prove the classic theorem
that a primitive positive definite quadratic form ax² + bxy + cy² represents
infinitely many prime numbers. The basic idea is to compute the Dirichlet
density (in the sense of §8) of the set of primes represented by ax² + bxy +
cy², for once we show that the density is positive, there must be infinitely
many primes represented. Here is the precise statement of what we will
prove:


Theorem 9.12. Let ax² + bxy + cy² be a primitive positive definite quadratic
form of discriminant D < 0, and let S be the set of primes represented
by ax² + bxy + cy². Then the Dirichlet density δ(S) exists and is given by the
formula

\[
\delta(S) =
\begin{cases}
\dfrac{1}{2h(D)} & \text{if } ax^2 + bxy + cy^2 \text{ is properly equivalent to its opposite,}\\[4pt]
\dfrac{1}{h(D)} & \text{otherwise.}
\end{cases}
\]

In particular, ax² + bxy + cy² represents infinitely many prime numbers.


Proof. Let O be the order of discriminant D, and let K = Q(√D). By
(7.3), we have D = f²d_K, where f is the conductor of O. As in the statement
of the theorem, let S = {p prime : p = ax² + bxy + cy²}. We need to
compute the Dirichlet density of S.

The first step is to relate S to the generalized ideal class group
I_K(f)/P_{K,Z}(f). From Theorem 7.7 we have the isomorphism C(D) ≅ C(O), so
that the class [ax² + bxy + cy²] ∈ C(D) corresponds to the class [𝔞₀] ∈
C(O) for some proper O-ideal 𝔞₀. Then part (iii) of Theorem 7.7 tells us
that

\[ S = \{p \text{ prime} : p = N(\mathfrak{b}),\ \mathfrak{b} \in [\mathfrak{a}_0]\}. \tag{9.13} \]

We need to state this in terms of the maximal order O_K. By Corollary 7.17
we may assume that 𝔞₀ is prime to f, and from here on we will consider
only primes p not dividing f. Under the map 𝔞 ↦ 𝔞O_K, we know that 𝔟 ∈
[𝔞₀] ∈ C(O) corresponds to 𝔟O_K ∈ [𝔞₀O_K] ∈ I_K(f)/P_{K,Z}(f) (Proposition
7.22), and furthermore, 𝔟 and 𝔟O_K have the same norm when prime to
f (Proposition 7.20). Thus (9.13) implies

\[ S = \{p \text{ prime} : p \nmid f,\ p = N(\mathfrak{b}),\ \mathfrak{b} \in [\mathfrak{a}_0\mathcal{O}_K]\}. \]




Since p is prime, the equation p = N(𝔟) forces 𝔟 to be prime, so that this
description of S can be written

\[ S = \{p \text{ prime} : p \nmid f,\ p = N(\mathfrak{p}),\ \mathfrak{p} \text{ prime},\ \mathfrak{p} \in [\mathfrak{a}_0\mathcal{O}_K]\}. \tag{9.14} \]

If L is the ring class field of O, then Artin Reciprocity gives us an isomorphism

\[ I_K(f)/P_{K,\mathbb{Z}}(f) \simeq \mathrm{Gal}(L/K). \tag{9.15} \]

Under this isomorphism, the class of 𝔞₀O_K maps to an element σ₀ ∈
Gal(L/K), which we can regard as an element of Gal(L/Q). Letting ⟨σ₀⟩
denote its conjugacy class in Gal(L/Q), we claim that

\[ S = \left\{p \text{ prime} : p \text{ unramified in } L,\ \left(\frac{L/\mathbb{Q}}{p}\right) = \langle\sigma_0\rangle\right\}. \tag{9.16} \]

The right hand side of (9.16) will be denoted S′, so that we must prove
S = S′.

To show S′ ⊂ S, let p ∈ S′. Thus ((L/Q)/p) = ⟨σ₀⟩, which means that
((L/Q)/𝔓) = σ₀ for some prime 𝔓 of L containing p. Then 𝔭 = 𝔓 ∩ O_K is
a prime of K containing p, and we claim that p = N(𝔭). To see this, note
that for any α ∈ O_L,

\[ \sigma_0(\alpha) \equiv \alpha^p \bmod \mathfrak{P} \tag{9.17} \]

since σ₀ = ((L/Q)/𝔓). But σ₀ ∈ Gal(L/K), so that when α ∈ O_K, the above
congruence reduces to

\[ \alpha \equiv \alpha^p \bmod \mathfrak{p}. \]

This implies O_K/𝔭 ≅ Z/pZ, and N(𝔭) = p follows. This fact and (9.17) then
imply that σ₀ is the Artin symbol ((L/K)/𝔭). Since [𝔞₀O_K] ∈ I_K(f)/P_{K,Z}(f)
corresponds to σ₀ ∈ Gal(L/K) under the isomorphism (9.15), it follows that
𝔭 is in the class of 𝔞₀O_K. Then (9.14) implies that p ∈ S, at least when p ∤ f,
and S′ ⊂ S follows. The opposite inclusion is straightforward and is left to
the reader (see Exercise 9.14). This completes the proof of (9.16).
From (9.16), the Čebotarev Density Theorem shows that S has Dirichlet
density

\[ \delta(S) = \frac{|\langle\sigma_0\rangle|}{|\mathrm{Gal}(L/\mathbb{Q})|} = \frac{|\langle\sigma_0\rangle|}{[L:\mathbb{Q}]}. \]

However, since σ₀ ∈ Gal(L/K), Lemma 9.3 implies that ⟨σ₀⟩ = {σ₀, σ₀⁻¹}
(see Exercise 9.15). Since [L : Q] = 2h(D), we see that

\[
\delta(S) =
\begin{cases}
\dfrac{1}{2h(D)} & \text{if } \sigma_0 \text{ has order} \le 2,\\[4pt]
\dfrac{1}{h(D)} & \text{otherwise.}
\end{cases}
\]

Now σ₀ has order ≤ 2 if and only if ax² + bxy + cy² has order ≤ 2 in C(D),
and this last statement means that ax² + bxy + cy² is properly equivalent
to its opposite. This completes the proof of Theorem 9.12. Q.E.D.


As an example of what the theorem says, consider forms of discriminant
−56. The class number is 4, and we know the reduced forms from §2. Then
Theorem 9.12 implies that

\[
\begin{aligned}
\delta(\{p \text{ prime} : p = x^2 + 14y^2\}) &= \tfrac{1}{8},\\
\delta(\{p \text{ prime} : p = 2x^2 + 7y^2\}) &= \tfrac{1}{8},\\
\delta(\{p \text{ prime} : p = 3x^2 + 2xy + 5y^2\}) &= \tfrac{1}{4}.
\end{aligned}
\]

Notice that these densities sum to 1/2, which is the density of primes for
which (−56/p) = 1. This example is no accident, for given any negative
discriminant, the densities of primes represented by the reduced forms
(counted properly) always sum to 1/2 (see Exercise 9.17).
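
The densities in this example can be reproduced mechanically. The sketch below enumerates the reduced primitive forms of a negative discriminant and applies the formula of Theorem 9.12. It assumes the standard criterion that a reduced form (a, b, c) is properly equivalent to its opposite exactly when b = 0, a = b or a = c; the function names are illustrative.

```python
from fractions import Fraction
from math import gcd

def reduced_forms(D):
    """Reduced primitive positive definite forms (a, b, c) with b^2 - 4ac = D < 0."""
    forms = []
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):           # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue                         # not reduced
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def density(form, h):
    """Density from Theorem 9.12 for one reduced form, given the class number h."""
    a, b, c = form
    order_at_most_2 = (b == 0 or a == b or a == c)   # assumed criterion, see lead-in
    return Fraction(1, 2 * h) if order_at_most_2 else Fraction(1, h)

forms = reduced_forms(-56)
h = len(forms)
table = {f: density(f, h) for f in forms if f[1] >= 0}
print(h, table, sum(table.values()))
```

For D = −56 this gives h = 4, the densities 1/8, 1/8 and 1/4 for the three forms with b ≥ 0, and total 1/2.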

A weaker form of Theorem 9.12, which asserts that ax² + bxy + cy² represents
infinitely many primes, was first stated by Dirichlet in 1840, though
his proof applied only to a restricted class of discriminants (see [27, Vol. I,
pp. 497-502]). A complete proof was given by Weber in 1882 [101], and in
1954 Briggs [10] found an "elementary" proof (in the sense of the "elementary"
proofs of the prime number theorem due to Erdős and Selberg).


D. Ring Class Fields and Generalized Dihedral Extensions 


We will conclude §9 by asking if there is an intrinsic characterization of
ring class fields. We know that they are Abelian extensions of K, but which
ones? The remarkable fact is that there is a purely field-theoretic way to
characterize ring class fields and their subfields. The key idea is to work
with the Galois group over Q. We used this strategy in §6 in dealing with
the genus field, and here it will be similarly successful. For the genus field,
we wanted Gal(L/Q) to be Abelian, while in the present case we will allow
slightly more complicated Galois groups. The crucial notion is when an
extension of K is generalized dihedral over Q. To define this, let K be an
imaginary quadratic field, and let L be an Abelian extension of K which is
Galois over Q. As we saw in the proof of Lemma 9.3, complex conjugation
τ is an automorphism of L, and the Galois group Gal(L/Q) can be written
as a semidirect product

\[ \mathrm{Gal}(L/\mathbb{Q}) \simeq \mathrm{Gal}(L/K) \rtimes (\mathbb{Z}/2\mathbb{Z}), \]

where the nontrivial element of Z/2Z acts on Gal(L/K) via conjugation by
τ. We say that L is generalized dihedral over Q if this action sends every
element in Gal(L/K) to its inverse.

In Lemma 9.3 we proved that every ring class field L is generalized
dihedral over Q, and it is easy to show that every subfield of L containing 
K is also generalized dihedral over Q (see Exercise 9.18). The unexpected 
result, due to Bruckner [11], is that this gives all extensions of K which are 
generalized dihedral over Q: 


Theorem 9.18. Let K be an imaginary quadratic field. Then an Abelian ex- 
tension L of K is generalized dihedral over Q if and only if L is contained in 
a ring class field of K. 


Proof. By the above discussion, we know that any extension of K contained
in a ring class field is generalized dihedral over Q. To prove the converse,
fix an Abelian extension L of K which is generalized dihedral over Q. By
Artin Reciprocity, there is an ideal 𝔪 and a subgroup P_{K,1}(𝔪) ⊂ H ⊂ I_K(𝔪)
such that the Artin map induces an isomorphism

\[ I_K(\mathfrak{m})/H \xrightarrow{\ \sim\ } \mathrm{Gal}(L/K). \tag{9.19} \]

We saw in §8 that all of this remains true when 𝔪 is enlarged, so that we
may assume that 𝔪 = fO_K for some integer f, and we can also assume
that f is divisible by the discriminant d_K of K (this will be useful later in
the proof). To prove the theorem, it suffices to show that P_{K,Z}(f) ⊂ H, for
this will imply that L lies in the ring class field of the order of conductor
f in O_K. From the definition of P_{K,Z}(f), this means that we have to prove
the following for elements u ∈ O_K:

\[ c \in \mathbb{Z},\ c \text{ prime to } f,\ u \equiv c \bmod f\mathcal{O}_K \implies u\mathcal{O}_K \in H. \tag{9.20} \]


The first step is to use the fact that P_{K,1}(fO_K) ⊂ H: if α, β ∈ O_K are
prime to f, then we claim that

\[ \alpha \equiv \beta \bmod f\mathcal{O}_K \implies (\alpha\mathcal{O}_K \in H \iff \beta\mathcal{O}_K \in H). \tag{9.21} \]

To prove this, pick an element γ ∈ O_K such that αγ ≡ 1 mod fO_K. Then
βγ ≡ 1 mod fO_K also holds, so that αγO_K and βγO_K both lie in P_{K,1}(fO_K)
⊂ H, and (9.21) follows immediately. One consequence of (9.21) is that
(9.20) is equivalent to the simpler statement

\[ c \in \mathbb{Z},\ c \text{ prime to } f \implies c\mathcal{O}_K \in H. \tag{9.22} \]

So we need to see how (9.22) follows from L being generalized dihedral
over Q. Under the isomorphism (9.19), we know that conjugation by τ on
Gal(L/K) corresponds to the usual action of τ on I_K(f). Then L being
generalized dihedral over Q means that for 𝔞 ∈ I_K(f), the class of the
conjugate ideal 𝔞̄ = τ(𝔞) gives the inverse of the class of 𝔞 in I_K(f)/H, which
in turn means that 𝔞𝔞̄ ∈ H. Since 𝔞𝔞̄ = N(𝔞)O_K by Lemma 7.14, we see that
for any ideal 𝔞 ∈ I_K(f), we have

\[ N(\mathfrak{a})\mathcal{O}_K \in H. \tag{9.23} \]

It remains to prove that (9.23) implies (9.22). Note first that it suffices
to prove (9.22) when c is a prime p not dividing f. Recall that d_K | f,
so that p is unramified in K. There are two cases to consider, depending
on whether or not p splits in K. If p splits, then p = N(𝔭), where 𝔭 is a
prime factor of pO_K. Then, by (9.23), we have pO_K = N(𝔭)O_K ∈ H, as
desired. If p doesn't split, then (d_K/p) = −1 by Corollary 5.17. Let q be a
prime such that q ≡ −p mod f (such primes exist by Dirichlet's theorem).
We claim that q splits completely in K. The proof will use the character χ
from Lemma 1.14. Recall that this lemma states that the Legendre symbol
(d_K/·) induces a well defined homomorphism χ : (Z/d_K Z)* → {±1}, and
since d_K < 0, it also tells us that χ([−1]) = −1. Since d_K | f, we have q ≡
−p mod d_K, and thus

\[ \left(\frac{d_K}{q}\right) = \chi([q]) = \chi([-p]) = \chi([-1])\chi([p]) = -\left(\frac{d_K}{p}\right) = 1. \]

Hence q splits completely in K. The argument for the split case implies
that qO_K ∈ H, and then q ≡ −p mod fO_K and (9.21) imply that pO_K =
(−p)O_K ∈ H. This proves (9.22) and completes the proof of Theorem 9.18.

Q.E.D.
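
The step of the proof that replaces an inert prime p by a split prime q ≡ −p mod f can be illustrated numerically. The sketch below uses the sample values K = Q(√−5), d_K = −20, f = 20 and p = 11, which are chosen only for illustration and do not come from the text; for an odd prime q not dividing d_K, the symbol (d_K/q) is computed by Euler's criterion.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def legendre(d, q):
    """Legendre symbol (d/q) for an odd prime q, via Euler's criterion."""
    r = pow(d % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

d_K, f, p = -20, 20, 11          # illustrative choice with d_K dividing f
inert = legendre(d_K, p)         # -1 means p does not split in K

# Primes q congruent to -p modulo f (Dirichlet's theorem guarantees infinitely many).
qs = [q for q in range(3, 400) if is_prime(q) and (q + p) % f == 0]
symbols = {legendre(d_K, q) for q in qs}
print(inert, qs[:3], symbols)    # -1 [29, 89, 109] {1}
```

Every such q has (d_K/q) = 1, as the character computation predicts; for instance 29 = 3² + 5·2² is a norm from Z[√−5].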


In Exercises 9.19-9.24, we will explore some other aspects of ring class 
fields, including a computation of the conductor (in the sense of class field 
theory) of a ring class field. For further discussion of ring class fields, see 
Bruckner [11], Cohn [19, §15.1] and Cohn [21, Chapter 8]. 


E. Exercises 


9.1. Prove that the Hilbert class field of an imaginary quadratic field is the 
ring class field of the maximal order. 


9.2. Let O be the order of conductor f in the imaginary quadratic field
K, and let L be the ring class field of O. Let 𝔪 = fO_K, and let τ
denote complex conjugation.

(a) Show that τ(𝔪) = 𝔪 and that τ(P_{K,Z}(f)) = P_{K,Z}(f).
(b) Show that ker(Φ_{τ(L)/K,𝔪}) = τ(ker(Φ_{L/K,𝔪})).
(c) Using ker(Φ_{L/K,𝔪}) = P_{K,Z}(f), conclude that ker(Φ_{τ(L)/K,𝔪}) =
ker(Φ_{L/K,𝔪}).

9.3. Formulate and prove versions of Theorems 9.2 and 9.4 for primes
represented by the principal form x² + xy + ((1 − D)/4)y² when D ≡
1 mod 4 is negative.

9.4. Let u_i, i = 0, 1, 2 be as in the proof of Lemma 9.6. If u₁ = u₂ = 0,
then use Cramer's rule to prove that α ∈ K.


9.5. Let L = K(∛m) be a cubic extension of K where m is a cubefree
integer and K is an imaginary quadratic field. If 𝔭 is any prime of K
dividing m, then prove that 𝔭 ramifies in L.

9.6. Verify that if K = Q(√−3) and L = K(∛m), where m is a cubefree
integer of the form 2^a 3^b, then L is one of the four fields listed in
(9.7).

9.7. Prove that the discriminant of the cubic polynomial x³ − a is −27a².

9.8. Use the arguments outlined in the proof of Proposition 9.5 to show
that none of the fields K(∛3), K(∛6) and K(∛12) can be the ring
class field of the order Z[√−27]. Hint: use 31 = 2² + 27·1².

9.9. Prove part (ii) of Proposition 9.5 using the hints given in the text.

9.10. This exercise is concerned with the order Z[9ω] of conductor 9 in
the field K = Q(ω), ω = e^{2πi/3}.

(a) Prove that L = K(∛3) is the ring class field of Z[9ω]. Hint:
adapt the proof of Proposition 9.5.

(b) Use Exercise 9.3 to prove that for primes p > 5, p = x² + xy +
61y² if and only if p ≡ 1 mod 3 and 3 is a cubic residue modulo p.

(c) Use (b) to prove that for primes p > 5, 4p = x² + 243y² if and
only if p ≡ 1 mod 3 and 3 is a cubic residue modulo p. Note that
this result, conjectured by Euler, was proved earlier in Exercise
4.15 using the supplementary laws of cubic reciprocity.

9.11. Prove part (ii) of Theorem 9.8.

9.12. This exercise is concerned with the proof of Theorem 9.9.

(a) Let θ = 1 + 3ω. To prove that (2/θ)_3 = (θ/2)_3, first use (4.10) to
show

\[
\begin{aligned}
\left(\frac{2}{\theta}\right)_3 &= \left(\frac{2}{1+3\omega}\right)_3 \equiv 2^{(N(1+3\omega)-1)/3} \equiv 4 \bmod (1+3\omega)\mathcal{O}_K,\\
\left(\frac{\theta}{2}\right)_3 &= \left(\frac{1+3\omega}{2}\right)_3 \equiv (1+3\omega)^{(N(2)-1)/3} \equiv 1+\omega \bmod 2\mathcal{O}_K,
\end{aligned}
\]

and then note that 1 + ω + ω² = 0 and 4 − ω² = −(1 + 2ω)(1 + 3ω).
(b) Prove part (ii) of Theorem 9.9.

9.13. Let K = Q(ω), ω = e^{2πi/3}. In this exercise we will use the ring class
field K(∛3) from Exercise 9.10 to prove the supplementary laws
of cubic reciprocity. Let p ≡ 1 mod 3 be prime. In Exercise 4.15 we
saw that 4p = a² + 27b², which gave us the factorization p = ππ̄
where π = (a + √−27 b)/2 is primary. We can assume that a ≡
1 mod 3.

(a) Prove that (ω/π)_3 = ω^{2(a+2)/3}. Hint: use (4.10).

(b) Adapt the proof of Theorem 9.9 to prove that (3/π)_3 = ω^{2b}.
Hint: use Exercise 9.10.

(c) Use 3 = −ω²(1 − ω)² to prove that ((1 − ω)/π)_3 = ω^{b+(a+2)/3}.

(d) Show that the results of (a) and (c) imply the supplementary
laws for cubic reciprocity as stated in (4.13).

9.14. Let S and S′ be the two sets of primes defined in the proof of Theorem
9.12. Prove that S ⊂ S′. Hint: use (9.14).

9.15. Let K be an imaginary quadratic field, and let K ⊂ L be an Abelian
extension which is generalized dihedral over Q. If σ ∈ Gal(L/K) ⊂
Gal(L/Q), then prove that the conjugacy class ⟨σ⟩ of σ in Gal(L/Q)
is the set {σ, σ⁻¹}.

9.16. In this exercise we will use (8.16) to give a different proof of Theorem
9.12. We will use the notation of the proof of Theorem 9.12.
Thus O is the order of conductor f in an imaginary quadratic field
K, and L is the ring class field of O. Let

\[ S = \{p \text{ prime} : p = ax^2 + bxy + cy^2\}. \]

(a) If ax² + bxy + cy² gives us the class [𝔞₀O_K] ∈ I_K(f)/P_{K,Z}(f),
show that

\[ S = \{p \text{ prime} : p \nmid f,\ p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \in [\mathfrak{a}_0\mathcal{O}_K]\}. \]

Hint: use (9.14).

(b) Use the Čebotarev Density Theorem to show that

\[ S'' = \{\mathfrak{p} \in \mathcal{P}_K : \mathfrak{p} \in [\mathfrak{a}_0\mathcal{O}_K]\} \]

has Dirichlet density δ(S″) = 1/h(D). Then use (8.16) to show
that δ(S″ ∩ 𝒫_{K,1}) = 1/h(D). Recall that 𝒫_{K,1} = {𝔭 ∈ 𝒫_K : N(𝔭)
is prime}.

(c) Show that the mapping 𝔭 ↦ N(𝔭) from S″ ∩ 𝒫_{K,1} to S is either
two-to-one or one-to-one, depending on whether or not 𝔞₀O_K
has order ≤ 2 in the class group. Then use (b) to prove Theorem
9.12.


9.17. Fix a negative discriminant D.

(a) Show that the sum of the densities of the primes represented
by the reduced forms of discriminant D with middle coefficient
b ≥ 0 is always 1/2.

(b) To explain the result of (a), first use Lemma 2.5 to show that
the primes represented by the forms listed in (a) are, up to a
finite set, exactly the primes for which (D/p) = 1. Then use the
Čebotarev Density Theorem to show that this set has density
1/2.

9.18. Let K be an imaginary quadratic field. Use Lemma 9.3 to prove
that any intermediate field between K and a ring class field of K is
generalized dihedral over Q.

9.19. An imaginary quadratic field K has infinitely many ring class fields
associated with it. In this exercise we will work out the relation between
the different ring class fields.

(a) If O₁ and O₂ are orders in K, then we get ring class fields L₁
and L₂. Prove that

\[ \mathcal{O}_1 \subset \mathcal{O}_2 \implies L_2 \subset L_1. \]

(b) If f_i is the conductor of O_i, then prove that O₁ ⊂ O₂ if and only
if f₂ | f₁, and conclude that the result of (a) can be stated in
terms of conductors as follows:

\[ f_2 \mid f_1 \implies L_2 \subset L_1. \]

In Exercise 9.24, we will see that the converse of this implication
is false.

(c) Show that the Hilbert class field is contained in the ring class
field of any order, and conclude that h(d_K) | h(f²d_K). This fact
was proved earlier in Theorem 7.24.

9.20. Let L be the ring class field of an order O in an imaginary quadratic
field K. Such a field has two "conductors" associated to it: first,
there is the conductor f of the order O, and second, there is the
class field theory conductor 𝔣(L/K) of L as an Abelian extension of
K. There should be a close relation between these conductors, and
the obvious guess would be that

\[ \mathfrak{f}(L/K) = f\mathcal{O}_K. \]

In Exercises 9.20-9.23, we will show that the answer is a bit more
complicated: the conductor is given by the formula

\[
\mathfrak{f}(L/K) =
\begin{cases}
\mathcal{O}_K, & f = 2 \text{ or } 3,\ K = \mathbb{Q}(\sqrt{-3})\\
\mathcal{O}_K, & f = 2,\ K = \mathbb{Q}(i)\\
(f/2)\mathcal{O}_K, & f = 2f',\ f' \text{ odd},\ 2 \text{ splits completely in } K\\
f\mathcal{O}_K, & \text{otherwise.}
\end{cases}
\]

To begin the proof, let f be a positive integer, and let K be an
imaginary quadratic field. Assume that f = 2f′, where f′ is odd and
2 splits completely in K. Let L and L′ be the ring class fields of
K corresponding to the orders of conductor f and f′ respectively.
Then prove that

\[ \mathfrak{f}(L/K) = \mathfrak{f}(L'/K). \]

Hint: first show that L′ ⊂ L, and then use Theorem 7.24 to conclude
that L′ = L.


9.21. Let L be the ring class field of the order of conductor f in an imaginary
quadratic field K, and assume that 𝔣(L/K) ≠ fO_K.

(a) Show that fO_K = 𝔭𝔪, where 𝔭 is prime and 𝔣(L/K) | 𝔪. We
will fix 𝔭 and 𝔪 for the rest of this exercise.

(b) Prove that I_K(f) ∩ P_{K,1}(𝔪) ⊂ P_{K,Z}(f).

(c) Show that there is an exact sequence

\[ \mathcal{O}_K^* \longrightarrow (\mathcal{O}_K/f\mathcal{O}_K)^* \xrightarrow{\ \phi\ } P_K \cap I_K(f)/P_{K,1}(f) \longrightarrow 1, \]

where P_K is the group of all principal ideals and φ is the map
which sends [α] ∈ (O_K/fO_K)* to [αO_K] ∈ P_K ∩ I_K(f)/P_{K,1}(f).
Hint: This is similar to what we did in (7.27).

(d) Consider the natural maps

\[
\begin{aligned}
\pi &: (\mathcal{O}_K/f\mathcal{O}_K)^* \longrightarrow (\mathcal{O}_K/\mathfrak{m})^*\\
\beta &: (\mathbb{Z}/f\mathbb{Z})^* \longrightarrow (\mathcal{O}_K/f\mathcal{O}_K)^*.
\end{aligned}
\]

Show that ker(π) ⊂ O_K* · Im(β). Hint: use (b) and the exact sequence
of (c) to show that φ⁻¹(I_K(f) ∩ P_{K,1}(𝔪)) = O_K* · ker(π)
and φ⁻¹(P_{K,Z}(f)) = O_K* · Im(β).

9.22. In this exercise we will assume that O_K* = {±1} (by Exercise 5.9, this
excludes the fields Q(√−3) and Q(i)). Let K, f and L be as in the
previous exercise, and assume in addition that if f = 2f′, f′ odd,
then 2 doesn't split completely in K. Our goal is to prove that

\[ \mathfrak{f}(L/K) = f\mathcal{O}_K. \]

We will argue by contradiction. Suppose that 𝔣(L/K) ≠ fO_K.
Then Exercise 9.21 implies that fO_K = 𝔭𝔪, where 𝔭 is prime and
𝔣(L/K) | 𝔪. Furthermore, if π and β are the natural maps

\[
\begin{aligned}
\pi &: (\mathcal{O}_K/f\mathcal{O}_K)^* \longrightarrow (\mathcal{O}_K/\mathfrak{m})^*\\
\beta &: (\mathbb{Z}/f\mathbb{Z})^* \longrightarrow (\mathcal{O}_K/f\mathcal{O}_K)^*
\end{aligned}
\]

then Exercise 9.21 also implies that ker(π) ⊂ O_K* · Im(β), and since
O_K* = {±1}, we see that

\[ \ker(\pi) \subset \mathrm{Im}(\beta). \]

We will show that this inclusion leads to a contradiction.

(a) Prove that

\[
|\ker(\pi)| =
\begin{cases}
N(\mathfrak{p}), & \mathfrak{p} \mid \mathfrak{m}\\
N(\mathfrak{p}) - 1, & \mathfrak{p} \nmid \mathfrak{m},
\end{cases}
\]

where p is the unique integer prime contained in 𝔭. Hint: use
Exercise 7.29.

(b) Note that N(𝔭) = p or p². Suppose first that N(𝔭) = p.

(i) Show that 𝔪 = m𝔭̄ for some integer m.

(ii) Use (i) to show that the map (Z/fZ)* → (O_K/𝔪)* is injective,
and conclude that ker(π) ∩ Im(β) = {1}.

(iii) Since ker(π) ⊂ Im(β), (ii) implies that ker(π) = {1}. Use
(a) to show that p = 2, 2 splits completely in K, and f =
2m where m is odd. This contradicts our assumption on f.

(c) It remains to consider the case when N(𝔭) = p². Here, f = pm
and 𝔪 = mO_K.

(i) Show that ker(π) ∩ Im(β) ≅ ker(θ), where θ : (Z/fZ)* →
(Z/mZ)* is the natural map.

(ii) Since ker(π) ⊂ Im(β), (i) implies that |ker(π)| ≤ |ker(θ)|,
and we know |ker(π)| from (a). Now compute |ker(θ)| and
use this to show that |ker(π)| ≤ |ker(θ)| is impossible.
Again we have a contradiction.

9.23. Recall the formula for the conductor 𝔣(L/K) stated in Exercise 9.20.

(a) Using Exercises 9.20 and 9.22, prove the desired formula when
O_K* = {±1}.

(b) Adapt the proof of Exercise 9.22 to the case O_K* ≠ {±1}, and
prove the formula for 𝔣(L/K) for all K.

9.24. Use the conductor formula from Exercise 9.20 to give infinitely many
examples where 𝔣(L/K) ≠ fO_K. Also show that the converse of part
(b) of Exercise 9.19 is not true in general (i.e., L₂ ⊂ L₁ need not
imply f₂ | f₁).


CHAPTER THREE 


COMPLEX MULTIPLICATION 


§10. ELLIPTIC FUNCTIONS AND COMPLEX MULTIPLICATION 


In Chapter Two we solved our problem of when a prime p can be written in
the form x² + ny². The criterion from Theorem 9.2 states that, with finitely
many exceptions,

\[ p = x^2 + ny^2 \iff (-n/p) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \text{ has an integer solution.} \]

The key ingredient is the polynomial f_n(x), which we know is the minimal
polynomial of a primitive element of the ring class field of Z[√−n]. But the
proof of Theorem 9.2 doesn't explain how to find such a primitive element,
so that we have only an abstract solution of the problem of p = x² + ny².
In this chapter, we will use modular functions and the theory of complex
multiplication to give a systematic method for finding f_n(x).

In §10 we will study elliptic functions and introduce the idea of complex
multiplication. A key role is played by the j-invariant of a lattice, and we
will show that if O is an order in an imaginary quadratic field K, then its
j-invariant j(O) is an algebraic number. But before we can get to the real
depth of the subject, we need to learn about modular functions. Thus §11
will present a brief but complete account of the main properties of modular
functions, including the modular equation. Then we will prove that j(O) is
not only an algebraic integer, but also that it generates (over K) the ring
class field of O. This theorem, often called the "First Main Theorem" of
complex multiplication, is the main result of §11. In §12 we will compute
j(O) in some special cases, and in §13 we will complete our study of j(O)
by describing an algorithm for computing its minimal polynomial (the so-called
"class equation"). When applied to the order Z[√−n], this theory
will give us an algorithm for constructing the polynomial f_n(x) that solves
p = x² + ny². Finally, in §14 we will discuss elliptic curves and primality
testing.

Before we can begin our discussion of complex multiplication, we need 
to learn some basic facts about elliptic functions and j-invariants. 


A. Elliptic Functions and the Weierstrass ℘-Function


To start, we define a lattice to be an additive subgroup L of C which is generated
by two complex numbers ω₁ and ω₂ which are linearly independent
over R. We express this by writing L = [ω₁, ω₂]. Then an elliptic function for
L is a function f(z) defined on C, except for isolated singularities, which
satisfies the following two conditions:

(i) f(z) is meromorphic on C.
(ii) f(z + ω) = f(z) for all ω ∈ L.

If L = [ω₁, ω₂], note that the second condition is equivalent to

\[ f(z + \omega_1) = f(z + \omega_2) = f(z). \]

Thus an elliptic function is a doubly periodic meromorphic function, and
elements of L are often referred to as periods.

One of the most important elliptic functions is the Weierstrass ℘-function,
which is defined as follows: given a complex number z not in the
lattice L, we set

\[ \wp(z; L) = \frac{1}{z^2} + \sum_{\omega \in L - \{0\}} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right). \]

When working with a fixed lattice L, we will usually write ℘(z) instead of
℘(z; L). Here are some basic properties of the ℘-function:


Theorem 10.1. Let ℘(z) be the Weierstrass ℘-function for the lattice L.

(i) ℘(z) is an elliptic function for L whose singularities consist of double
poles at the points of L.

(ii) ℘(z) satisfies the differential equation

\[ \wp'(z)^2 = 4\wp(z)^3 - g_2(L)\wp(z) - g_3(L), \]

where the constants g₂(L) and g₃(L) are defined by

\[
g_2(L) = 60 \sum_{\omega \in L - \{0\}} \frac{1}{\omega^4}, \qquad
g_3(L) = 140 \sum_{\omega \in L - \{0\}} \frac{1}{\omega^6}.
\]

(iii) ℘(z) satisfies the addition law

\[ \wp(z + w) = -\wp(z) - \wp(w) + \frac{1}{4}\left(\frac{\wp'(z) - \wp'(w)}{\wp(z) - \wp(w)}\right)^2 \]

whenever z, w ∉ L and z + w ∉ L.
Proof. The first step is to prove the following lemma:


Lemma 10.2. If L is a lattice and r > 2, then the series

\[ G_r(L) = \sum_{\omega \in L - \{0\}} \frac{1}{\omega^r} \]

converges absolutely.


Proof. If L = [ω₁, ω₂], then we need to show that the series

\[ \sum_{\omega \in L - \{0\}} \frac{1}{|\omega|^r} = {\sum_{m,n}}' \frac{1}{|m\omega_1 + n\omega_2|^r} \]

converges, where ∑′ denotes summation over all ordered pairs (m, n) ≠
(0, 0) of integers. If we let M = min{|xω₁ + yω₂| : x² + y² = 1}, then it is
easy to see that for all x, y ∈ R,

\[ |x\omega_1 + y\omega_2| \ge M\sqrt{x^2 + y^2} \]

(see Exercise 10.1), and it follows that

\[ {\sum_{m,n}}' \frac{1}{|m\omega_1 + n\omega_2|^r} \le \frac{1}{M^r} {\sum_{m,n}}' \frac{1}{(m^2 + n^2)^{r/2}}. \]

By comparing the sum on the right to the integral

\[ \iint_{x^2 + y^2 \ge 1} \frac{1}{(x^2 + y^2)^{r/2}}\,dx\,dy, \]

it is easy to show that the sum in question converges when r > 2 (see Exercise
10.1). Q.E.D.
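
The role of the hypothesis r > 2 can be seen numerically. The sketch below, which is only an illustration and not part of the proof, computes partial sums of ∑′ 1/(m² + n²)^{r/2} over the boxes |m|, |n| ≤ N; the function name `partial_sum` and the box sizes are arbitrary choices.

```python
def partial_sum(r, N):
    """Sum of (m^2 + n^2)^(-r/2) over integer pairs (m, n) != (0, 0) with |m|, |n| <= N."""
    total = 0.0
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if (m, n) != (0, 0):
                total += (m * m + n * n) ** (-r / 2)
    return total

for r in (2, 3):
    s10, s20, s40 = (partial_sum(r, N) for N in (10, 20, 40))
    # Growth of the partial sum each time the box size doubles.
    print(r, round(s20 - s10, 3), round(s40 - s20, 3))
```

For r = 3 the increments shrink by roughly half at each doubling, as a convergent series requires, while for r = 2 each doubling adds about the same amount (close to 2π log 2), so the partial sums grow without bound.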

We can now show that ℘(z) is holomorphic outside L. Namely, if Ω is a
compact subset of C missing L, it suffices to show that the sum in

\[ \wp(z) = \frac{1}{z^2} + \sum_{\omega \in L - \{0\}} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right) \]

converges absolutely and uniformly on Ω. Pick a number R such that |z| ≤
R for all z ∈ Ω. Now suppose that z ∈ Ω and that ω ∈ L satisfies |ω| ≥ 2R.
Then |z − ω| ≥ ½|ω|, and one sees that

\[ \left| \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right| = \left| \frac{z(2\omega - z)}{\omega^2(z-\omega)^2} \right| \le \frac{R\left(2|\omega| + \tfrac{1}{2}|\omega|\right)}{|\omega|^2 \cdot \tfrac{1}{4}|\omega|^2} = \frac{10R}{|\omega|^3}. \]

Since |ω| ≥ 2R holds for all but finitely many elements of L, it follows from
Lemma 10.2 that the sum in the ℘-function converges absolutely and uniformly
on Ω. Thus ℘(z) is holomorphic on C − L and has a double pole at
the origin.

Notice that since (−z − ω)² = (z − (−ω))², the identity ℘(−z) = ℘(z) follows
immediately from absolute convergence. Thus the ℘-function is an
even function.

To show that ℘(z) is periodic is a bit trickier. We first differentiate the
series for ℘(z) to obtain

\[ \wp'(z) = -2 \sum_{\omega \in L} \frac{1}{(z-\omega)^3}. \]

Arguing as above, this series converges absolutely, and it follows easily that
℘′(z) is an elliptic function for L (see Exercise 10.2). Now suppose that
L = [ω₁, ω₂]. The functions ℘(z) and ℘(z + ω₁) have the same derivative
(since ℘′(z) is periodic), and hence they differ by a constant, say ℘(z) =
℘(z + ω₁) + C. Evaluating this at −ω₁/2 (which is not in L), we obtain

\[ \wp(-\omega_1/2) = \wp(-\omega_1/2 + \omega_1) + C = \wp(\omega_1/2) + C. \]

Since ℘(z) is an even function, C must be zero, and the same argument
applies to ω₂, so periodicity is proved.
It follows that the poles of ℘(z) are all double poles and lie exactly on the
points of L, and (i) is proved.

Turning to (ii), we will first compute the Laurent expansion of ℘(z) about
the origin:


Lemma 10.3. Let ℘(z) be the ℘-function for the lattice L, and let G_r(L) be
the constants defined in Lemma 10.2. Then, in a neighborhood of the origin,
we have

\[ \wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} (2n+1)\,G_{2n+2}(L)\,z^{2n}. \]


Proof. For |x| < 1, we have the series expansion

\[ \frac{1}{(1-x)^2} = 1 + \sum_{n=1}^{\infty} (n+1)x^n \]

(see Exercise 10.3). Thus, if |z| < |ω|, we can put x = z/ω in the above
series, and it follows easily that

\[ \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} = \sum_{n=1}^{\infty} \frac{n+1}{\omega^{n+2}}\,z^n. \]

Summing over all ω ∈ L − {0} and using absolute convergence, we obtain

\[ \wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} (n+1)\,G_{n+2}(L)\,z^n. \]

Since ℘(z) is an even function, all of the odd coefficients must vanish,
giving us the desired Laurent expansion. Q.E.D.


From this lemma, we see that

$$\wp'(z) = -\frac{2}{z^3} + \sum_{n=1}^{\infty} 2n(2n+1)G_{2n+2}(L)z^{2n-1},$$

and then one computes the first few terms of $\wp(z)^3$ and $\wp'(z)^2$ as follows:

$$\wp(z)^3 = \frac{1}{z^6} + \frac{9G_4(L)}{z^2} + 15G_6(L) + \cdots$$
$$\wp'(z)^2 = \frac{4}{z^6} - \frac{24G_4(L)}{z^2} - 80G_6(L) + \cdots,$$

where $+\cdots$ indicates terms involving positive powers of z (see Exercise
10.4). Now consider the elliptic function

$$F(z) = \wp'(z)^2 - 4\wp(z)^3 + 60G_4(L)\wp(z) + 140G_6(L).$$

Using the above expansions, it is easy to see that F(z) vanishes at the
origin, and then by periodicity, F(z) vanishes at all points of L. But it is
also holomorphic on $\mathbf{C} - L$, so that F(z) is holomorphic on all of $\mathbf{C}$. An
easy argument using Liouville's Theorem shows that F(z) is constant (see
Exercise 10.5), so that F(z) is identically zero. Since $g_2(L)$ and $g_3(L)$ were
defined to be $60G_4(L)$ and $140G_6(L)$ respectively, the proof of (ii) is complete.
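
As a check on the claim that F(z) vanishes at the origin, the leading terms of the expansions of $\wp(z)^3$ and $\wp'(z)^2$, together with $\wp(z) = z^{-2} + 3G_4(L)z^2 + \cdots$ from Lemma 10.3, can be substituted term by term. The following is only a worked sketch; the symbol $O(z^2)$ for the omitted positive powers of z is our shorthand and does not appear in the text.

```latex
\begin{align*}
F(z) &= \wp'(z)^2 - 4\wp(z)^3 + 60G_4(L)\wp(z) + 140G_6(L) \\
     &= \Bigl(\frac{4}{z^6} - \frac{24G_4(L)}{z^2} - 80G_6(L)\Bigr)
        - 4\Bigl(\frac{1}{z^6} + \frac{9G_4(L)}{z^2} + 15G_6(L)\Bigr)
        + \frac{60G_4(L)}{z^2} + 140G_6(L) + O(z^2) \\
     &= (4-4)\frac{1}{z^6} + (-24-36+60)\frac{G_4(L)}{z^2}
        + (-80-60+140)\,G_6(L) + O(z^2) \\
     &= O(z^2).
\end{align*}
```

All negative powers and the constant term cancel, which is exactly what forces the coefficients 60 and 140 in the definition of F(z).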

In order to prove (iii), we will need the following lemma:


Lemma 10.4. If $z, w \notin L$, then $\wp(z) = \wp(w)$ if and only if $z \equiv \pm w \bmod L$.

Proof. The $\Leftarrow$ direction of the proof is trivial since $\wp(z)$ is an even
function. To argue the other way, suppose that $L = [\omega_1, \omega_2]$, and fix a
number $-1 < \delta < 0$. Let $\mathcal{P}$ denote the parallelogram
$\{s\omega_1 + t\omega_2 : \delta \le s, t \le \delta + 1\}$, and let $\Gamma$ be its boundary oriented
counterclockwise. Note that every complex number is congruent modulo L to a
number in $\mathcal{P}$ (see Exercise 10.6). Fix w and consider the function
$f(z) = \wp(z) - \wp(w)$. By adjusting $\delta$, we can arrange that f(z) has no zeros
or poles on $\Gamma$. Then it is well known that

$$\frac{1}{2\pi i}\int_{\Gamma} \frac{f'(z)}{f(z)}\,dz = Z - P,$$

where Z (resp. P) is the number of zeros (resp. poles) of f(z) in $\mathcal{P}$,
counting multiplicity. Since $f'(z)/f(z)$ is periodic, the integrals on opposite
sides of $\Gamma$ cancel, and thus $\int_{\Gamma} (f'(z)/f(z))\,dz = 0$. This shows that Z = P.
However, P is easy to compute: from the definition of the parallelogram $\mathcal{P}$,
it's obvious that 0 is the only pole of $f(z) = \wp(z) - \wp(w)$ in $\mathcal{P}$. It's a
double pole, and thus Z = P = 2, so that f(z) has two zeros (counting
multiplicity) in $\mathcal{P}$.

There are now two cases to consider. If $w \not\equiv -w \bmod L$, then modulo L,
w and $-w$ give rise to two distinct points of $\mathcal{P}$, both of which are zeros of
$f(z) = \wp(z) - \wp(w)$. Since Z = 2, these are all of the zeros, and their
multiplicity is one, i.e., $\wp'(w) \ne 0$. If $w \equiv -w \bmod L$, then $2w \in L$. Since
$\wp'(z)$ is an odd function (being the derivative of an even function), we obtain

$$\wp'(w) = \wp'(w - 2w) = \wp'(-w) = -\wp'(w),$$

which forces $\wp'(w) = 0$. Thus modulo L, w gives rise to a zero of f(z) of
multiplicity $\ge 2$ in $\mathcal{P}$, and again Z = 2 implies that these are all. This
proves the lemma. Q.E.D.


The proof of Lemma 10.4 yields the following useful corollary:

Corollary 10.5. If $w \notin L$, then $\wp'(w) = 0$ if and only if $2w \in L$. Q.E.D.


Now we can finally prove the addition theorem. Fix $w \notin L$, and consider
the elliptic function

$$G(z) = \wp(z+w) + \wp(z) + \wp(w) - \frac{1}{4}\left(\frac{\wp'(z) - \wp'(w)}{\wp(z) - \wp(w)}\right)^2.$$

If we can show that G(z) is holomorphic on $\mathbf{C}$ and vanishes at the origin,
then as in (ii), Liouville's Theorem will imply that G(z) vanishes identically,
and the addition theorem will be proved.

Using Lemma 10.4, we see that the possible singularities of G(z) come
from three sources: L, $L + \{w\}$ and $L - \{w\}$. By periodicity, it suffices to
consider G(0), G(w) and G(-w). Let's begin with G(0). Using the Laurent
expansions for $\wp(z)$ and $\wp'(z)$, one sees that

$$\frac{1}{4}\left(\frac{\wp'(z) - \wp'(w)}{\wp(z) - \wp(w)}\right)^2 = \frac{1}{4}\left(\frac{-2/z^3 - \wp'(w) + \cdots}{1/z^2 - \wp(w) + \cdots}\right)^2 = \frac{1}{z^2} + 2\wp(w) + \cdots,$$

where as usual, $+\cdots$ means terms involving positive powers of z. Hence

$$G(z) = \wp(z+w) + \wp(z) + \wp(w) - \frac{1}{z^2} - 2\wp(w) + \cdots,$$

and it follows that G(0) = 0.
To simplify the remainder of the argument, we will assume that $2w \notin L$.
Turning to G(w), we use L'Hospital's Rule to obtain

$$G(w) = \wp(2w) + 2\wp(w) - \frac{1}{4}\left(\frac{\wp''(w)}{\wp'(w)}\right)^2.$$

Since $2w \notin L$, Corollary 10.5 shows that $\wp'(w) \ne 0$, and thus G(w) is
defined. It remains to consider G(-w). We begin with some Laurent
expansions about $z = -w$:

$$\wp(z+w) = \frac{1}{(z+w)^2} + \cdots$$
$$\wp(z) = \wp(-w) + \wp'(-w)(z+w) + \cdots = \wp(w) - \wp'(w)(z+w) + \cdots,$$

where $+\cdots$ now refers to higher powers of $z + w$. Since $\wp'(w) \ne 0$, these
formulas make it easy to show that G(-w) is defined (see Exercise 10.7).
This shows that G(z) is holomorphic and vanishes at 0, so that G(z)
vanishes everywhere.

To complete the proof, we need to consider the case $2w \in L$. We leave
this to the reader (see Exercise 10.7). We have now proved all three parts
of Theorem 10.1. Q.E.D.


There are many more results connected with the Weierstrass $\wp$-function,
and we refer the reader to Chandrasekharan [16, Chapter III], Lang [73, 
Chapter 1] or Whittaker and Watson [109, Chapter XX] for more details. 


B. The j-Invariant of a Lattice 


Elliptic functions depend on which lattice is being used, but sometimes
different lattices can have basically the same elliptic functions. We say that
two lattices L and L' are homothetic if there is a nonzero complex number
$\lambda$ such that $L' = \lambda L$. Note that homothety is an equivalence relation. It is
easy to check how homothety affects elliptic functions: if f(z) is an elliptic


206 §10 ELLIPTIC FUNCTIONS AND COMPLEX MULTIPLICATION 


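function for L, then $f(\lambda z)$ is an elliptic function for $\lambda^{-1}L$, since
$f(\lambda(z + \omega')) = f(\lambda z)$ exactly when $\lambda\omega' \in L$. Furthermore, the
$\wp$-function transforms as follows:

$$\wp(\lambda z; \lambda L) = \lambda^{-2}\wp(z; L).$$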


Thus we would like to classify lattices up to homothety, and this is where 
the j-invariant comes in. 

Given a lattice L, we have the constants $g_2(L)$ and $g_3(L)$ which appear
in the differential equation for $\wp(z)$. It is customary to set

$$\Delta(L) = g_2(L)^3 - 27g_3(L)^2.$$

The number $\Delta(L)$ is closely related to the discriminant of the polynomial
$4x^3 - g_2(L)x - g_3(L)$ that appears in the differential equation for $\wp(z)$. In
fact, if $e_1$, $e_2$ and $e_3$ are the roots of this polynomial, then one can show
that

$$\Delta(L) = 16(e_1 - e_2)^2(e_1 - e_3)^2(e_2 - e_3)^2 \tag{10.6}$$

(see Exercise 10.8). An important fact is that $\Delta(L)$ never vanishes, i.e.,


Proposition 10.7. If L is a lattice, then $\Delta(L) \ne 0$.


Proof. If $w \notin L$ and $2w \in L$, then Corollary 10.5 implies that $\wp'(w) = 0$.
Then the differential equation from Theorem 10.1 tells us that

$$0 = \wp'(w)^2 = 4\wp(w)^3 - g_2(L)\wp(w) - g_3(L),$$

so that $\wp(w)$ is a root of $4x^3 - g_2(L)x - g_3(L)$. If $L = [\omega_1, \omega_2]$, this process
gives three roots $\wp(\omega_1/2)$, $\wp(\omega_2/2)$ and $\wp((\omega_1 + \omega_2)/2)$, which are distinct
by Lemma 10.4 since $\pm\omega_1/2$, $\pm\omega_2/2$ and $\pm(\omega_1 + \omega_2)/2$ are distinct modulo
L. Thus the roots of $4x^3 - g_2(L)x - g_3(L)$ are distinct, and $\Delta(L) \ne 0$ by
(10.6). Q.E.D.


The j-invariant j(L) of the lattice L is defined to be the complex number

$$j(L) = 1728\,\frac{g_2(L)^3}{g_2(L)^3 - 27g_3(L)^2} = 1728\,\frac{g_2(L)^3}{\Delta(L)}. \tag{10.8}$$

Note that j(L) is always defined since $\Delta(L) \ne 0$. The reason for the factor
of 1728 will become clear in §11. The remarkable fact is that the j-invariant
j(L) characterizes the lattice L up to homothety:


Theorem 10.9. If L and L' are lattices in $\mathbf{C}$, then j(L) = j(L') if and only
if L and L' are homothetic.
Proof. It is easy to see that homothetic lattices have the same j-invariant.
Namely, if $\lambda \in \mathbf{C}^*$, then the definition of $g_2(L)$ and $g_3(L)$ implies that

$$\begin{aligned} g_2(\lambda L) &= \lambda^{-4}g_2(L) \\ g_3(\lambda L) &= \lambda^{-6}g_3(L), \end{aligned} \tag{10.10}$$

and $j(\lambda L) = j(L)$ follows easily.
Now suppose that L and L' are lattices such that j(L) = j(L'). We first
claim that there is a complex number $\lambda$ such that

$$\begin{aligned} g_2(L') &= \lambda^{-4}g_2(L) \\ g_3(L') &= \lambda^{-6}g_3(L). \end{aligned} \tag{10.11}$$

When $g_2(L') \ne 0$ and $g_3(L') \ne 0$, we can pick a number $\lambda$ such that

$$\lambda^4 = \frac{g_2(L)}{g_2(L')}.$$

Since j(L) = j(L'), some easy algebra shows that

$$\lambda^{12} = \left(\frac{g_2(L)}{g_2(L')}\right)^3 = \left(\frac{g_3(L)}{g_3(L')}\right)^2,$$

so that

$$\lambda^6 = \pm\frac{g_3(L)}{g_3(L')}.$$

Replacing $\lambda$ by $i\lambda$ if necessary, we can assume that the above sign is +, and
then (10.11) follows. The proof when $g_2(L') = 0$ or $g_3(L') = 0$ is similar and
is left to the reader (see Exercise 10.9).

To exploit (10.11), we need to learn more about the Laurent expansion
of the $\wp$-function:


Lemma 10.12. Let $\wp(z)$ be the $\wp$-function for the lattice L, and as in Lemma
10.3, let

$$\wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} (2n+1)G_{2n+2}(L)z^{2n}$$

be its Laurent expansion. Then for $n \ge 1$, the coefficient $(2n+1)G_{2n+2}(L)$
of $z^{2n}$ is a polynomial with rational coefficients, independent of L, in $g_2(L)$
and $g_3(L)$.


Proof. For simplicity, we will write the coefficients of the Laurent
expansion as $a_n = (2n+1)G_{2n+2}(L)$. To get a relation among the $a_n$'s, we
differentiate the equation $\wp'(z)^2 = 4\wp(z)^3 - g_2(L)\wp(z) - g_3(L)$ to obtain

$$\wp''(z) = 6\wp(z)^2 - (1/2)g_2(L).$$

By substituting in the Laurent expansion for $\wp(z)$ and comparing the
coefficients of $z^{2n-2}$, one easily sees that for $n \ge 3$,

$$2n(2n-1)a_n = 6\left(2a_n + \sum_{i=1}^{n-2} a_i a_{n-1-i}\right)$$

(see Exercise 10.10), and hence

$$(2n+3)(n-2)a_n = 3\sum_{i=1}^{n-2} a_i a_{n-1-i}.$$

Since $g_2(L) = 60G_4(L) = 20a_1$ and $g_3(L) = 140G_6(L) = 28a_2$, an easy
induction shows that $a_n$ is a polynomial with rational coefficients in $g_2(L)$
and $g_3(L)$. This proves the lemma. Q.E.D.
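
The induction in this proof is effective, and it may help to see it run. The sketch below iterates the recursion $(2n+3)(n-2)a_n = 3\sum_{i=1}^{n-2} a_i a_{n-1-i}$ starting from $a_1 = g_2/20$ and $a_2 = g_3/28$, using exact rational arithmetic. The function name `laurent_coefficients` and the representation of a polynomial as a dictionary keyed by exponent pairs are our own illustrative choices, not notation from the text.

```python
from fractions import Fraction


def laurent_coefficients(n_max):
    """Return {n: a_n} for 1 <= n <= n_max.

    Each a_n is a dict mapping (i, j) to the rational coefficient of
    g2**i * g3**j, so {(2, 0): Fraction(1, 1200)} means g2^2 / 1200.
    """
    a = {
        1: {(1, 0): Fraction(1, 20)},  # a_1 = g2 / 20
        2: {(0, 1): Fraction(1, 28)},  # a_2 = g3 / 28
    }
    for n in range(3, n_max + 1):
        total = {}
        # Accumulate the sum of a_i * a_{n-1-i} for i = 1, ..., n-2.
        for i in range(1, n - 1):
            for (p1, q1), c1 in a[i].items():
                for (p2, q2), c2 in a[n - 1 - i].items():
                    key = (p1 + p2, q1 + q2)
                    total[key] = total.get(key, Fraction(0)) + c1 * c2
        # Divide by (2n+3)(n-2)/3, as in the recursion of the proof.
        scale = Fraction(3, (2 * n + 3) * (n - 2))
        a[n] = {key: scale * c for key, c in total.items() if c != 0}
    return {n: a[n] for n in range(1, n_max + 1)}


coeffs = laurent_coefficients(5)
for n in sorted(coeffs):
    print(n, coeffs[n])
```

The first new coefficients are $a_3 = g_2^2/1200$ and $a_4 = 3g_2g_3/6160$, so every $a_n$ is visibly a polynomial in $g_2$ and $g_3$ with rational coefficients.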


Now suppose that we have lattices L and L' such that (10.11) holds
for some constant $\lambda$. We claim that $L' = \lambda L$. To see this, first note that
by (10.10), we have $g_2(L') = g_2(\lambda L)$ and $g_3(L') = g_3(\lambda L)$. Then Lemma
10.12 implies that $\wp(z; L')$ and $\wp(z; \lambda L)$ have the same Laurent expansion
about 0, so that the two functions agree in a neighborhood of the
origin, and hence $\wp(z; L') = \wp(z; \lambda L)$ everywhere. Since the lattice is the
set of poles of the $\wp$-function, this proves that $L' = \lambda L$, and the theorem is
proved. Q.E.D.


Besides the notion of the j-invariant of a lattice, there is another way
to think about the j-invariant which will be useful when we study modular
functions. Given a complex number $\tau$ in the upper half plane
$\mathfrak{h} = \{\tau \in \mathbf{C} : \mathrm{Im}(\tau) > 0\}$, we get the lattice $[1, \tau]$, and then the
j-function $j(\tau)$ is defined by

$$j(\tau) = j([1, \tau]).$$

The analytic properties of $j(\tau)$ play an important role in the theory of
complex multiplication and will be studied in detail in §11.


C. Complex Multiplication 


We begin with the simple observation that orders in imaginary quadratic
fields give rise to a natural class of lattices. Namely, let $\mathcal{O}$ be an order in
the imaginary quadratic field K, and let $\mathfrak{a}$ be a proper fractional $\mathcal{O}$-ideal.
We know from §7 that $\mathfrak{a} = [\alpha, \beta]$ for some $\alpha, \beta \in K$ (see Exercise 7.8). We
can regard K as a subset of $\mathbf{C}$, and since K is imaginary quadratic, $\alpha$ and
$\beta$ are linearly independent over $\mathbf{R}$ (see Exercise 10.11). Thus $\mathfrak{a} = [\alpha, \beta]$ is a
lattice in $\mathbf{C}$, and consequently the j-invariant $j(\mathfrak{a})$ is defined. These
complex numbers, often called singular moduli, have some remarkable
properties which will be explored in §11. For now, we have the more modest goal
of trying to motivate the idea of complex multiplication.

In order to simplify our discussion of complex multiplication, we will
fix the lattice L. As usual, $\wp(z; L)$ is written $\wp(z)$, and to simplify things
further, $g_2(L)$ and $g_3(L)$ will be written $g_2$ and $g_3$. The basic idea of
complex multiplication goes back to the addition law for the $\wp$-function, proved
in part (iii) of Theorem 10.1. If we specialize to the case z = w, then
L'Hospital's rule gives the following duplication formula for the $\wp$-function:

$$\wp(2z) = -2\wp(z) + \frac{1}{4}\left(\frac{\wp''(z)}{\wp'(z)}\right)^2. \tag{10.13}$$

However, the differential equation from Theorem 10.1 implies that

$$\wp'(z)^2 = 4\wp(z)^3 - g_2\wp(z) - g_3$$
$$\wp''(z) = 6\wp(z)^2 - (1/2)g_2,$$

and substituting these expressions into (10.13), we obtain

$$\wp(2z) = -2\wp(z) + \frac{(12\wp(z)^2 - g_2)^2}{16(4\wp(z)^3 - g_2\wp(z) - g_3)}.$$

Thus $\wp(2z)$ is a rational function in $\wp(z)$. More generally, one can show
by induction that for any positive integer n, $\wp(nz)$ is a rational function in
$\wp(z)$ (see Exercise 10.12). So the natural question to ask is whether there
are any other complex numbers $\alpha$ for which $\wp(\alpha z)$ is a rational function in
$\wp(z)$. The answer is rather surprising:


Theorem 10.14. Let L be a lattice, and let $\wp(z)$ be the $\wp$-function for L.
Then, for a number $\alpha \in \mathbf{C} - \mathbf{Z}$, the following statements are equivalent:
(i) $\wp(\alpha z)$ is a rational function in $\wp(z)$.
(ii) $\alpha L \subset L$.
(iii) There is an order $\mathcal{O}$ in an imaginary quadratic field K such that $\alpha \in \mathcal{O}$
and L is homothetic to a proper fractional $\mathcal{O}$-ideal.
Furthermore, if these conditions are satisfied, then $\wp(\alpha z)$ can be written in
the form

$$\wp(\alpha z) = \frac{A(\wp(z))}{B(\wp(z))},$$

where A(x) and B(x) are relatively prime polynomials such that

$$\deg(A(x)) = \deg(B(x)) + 1 = [L : \alpha L] = N(\alpha).$$
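
For orientation, the duplication formula already shows the same degree pattern, even though $\alpha = 2$ lies outside the hypothesis $\alpha \in \mathbf{C} - \mathbf{Z}$ (the integer case is the subject of Exercise 10.12). Writing $x = \wp(z)$ and putting the two terms of the formula for $\wp(2z)$ over a common denominator gives the following; this is a worked illustration under that substitution, not a step of the proof.

```latex
\begin{align*}
\wp(2z) &= -2x + \frac{(12x^2 - g_2)^2}{16(4x^3 - g_2x - g_3)} \\
        &= \frac{(12x^2 - g_2)^2 - 32x(4x^3 - g_2x - g_3)}{16(4x^3 - g_2x - g_3)} \\
        &= \frac{16x^4 + 8g_2x^2 + 32g_3x + g_2^2}{16(4x^3 - g_2x - g_3)}.
\end{align*}
```

The numerator has degree $4 = [L : 2L]$ and the denominator has degree 3, in agreement with the degrees $n^2$ and $n^2 - 1$ asserted in Exercise 10.12(b) for n = 2.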




Proof. (i) $\Rightarrow$ (ii). If $\wp(\alpha z)$ is a rational function in $\wp(z)$, then there are
polynomials A(x) and B(x) such that

$$B(\wp(z))\,\wp(\alpha z) = A(\wp(z)). \tag{10.15}$$

Since $\wp(z)$ and $\wp(\alpha z)$ have double poles at the origin, it follows from (10.15)
that

$$\deg(A(x)) = \deg(B(x)) + 1. \tag{10.16}$$

Now let $\omega \in L$. Then (10.15) and (10.16) show that $\wp(\alpha z)$ has a pole at $\omega$,
which means that $\wp(z)$ has a pole at $\alpha\omega$. Since the poles of $\wp(z)$ are exactly
the period lattice L, this implies that $\alpha\omega \in L$, and $\alpha L \subset L$ follows.

(ii) $\Rightarrow$ (i). If $\alpha L \subset L$, it follows that $\wp(\alpha z)$ is meromorphic and has L
as a lattice of periods. Furthermore, note that $\wp(\alpha z)$ is an even function
since $\wp(z)$ is. Then the following lemma immediately implies that $\wp(\alpha z)$ is
a rational function in $\wp(z)$:


Lemma 10.17. Any even elliptic function for L is a rational function in $\wp(z)$.
Proof. The proof of this assertion is covered in Exercise 10.13. Q.E.D.


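(ii) $\Rightarrow$ (iii). Suppose that $\alpha L \subset L$. Replacing L by $\lambda L$ for suitable $\lambda$, we
can assume that $L = [1, \tau]$ for some $\tau \in \mathbf{C} - \mathbf{R}$. Then $\alpha L \subset L$ means that
$\alpha = a + b\tau$ and $\alpha\tau = c + d\tau$ for some integers a, b, c and d. Taking the
quotient of the two equations, we obtain

$$\tau = \frac{c + d\tau}{a + b\tau},$$

which gives us the quadratic equation

$$b\tau^2 + (a - d)\tau - c = 0.$$

Since $\alpha$ is not an integer, we must have $b \ne 0$, and since $\tau$ is not real,
$K = \mathbf{Q}(\tau)$ is then an imaginary quadratic field. It follows that

$$\mathcal{O} = \{\beta \in K : \beta L \subset L\}$$

is an order of K for which L is a proper fractional $\mathcal{O}$-ideal, and since $\alpha$ is
obviously in $\mathcal{O}$, we are done.
(iii) $\Rightarrow$ (ii). This implication is trivial.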
Finally, to prove the last statement of the theorem, suppose that

$$\wp(\alpha z) = \frac{A(\wp(z))}{B(\wp(z))}, \tag{10.18}$$

where A(x) and B(x) are relatively prime. By (10.16), we know that
$\deg(A(x)) = \deg(B(x)) + 1$, and in Corollary 11.27, we will show that
$N(\alpha) = [L : \alpha L]$. It remains to prove that the degree of A(x) is the
index $[L : \alpha L]$.

Fix $z \in \mathbf{C}$ such that $2z \notin (1/\alpha)L$, and consider the polynomial
$A(x) - \wp(\alpha z)B(x)$. This polynomial has the same degree as A(x), and z can be
chosen so that it has distinct roots (see Exercise 10.14). Then consider the
lattices $L \subset (1/\alpha)L$, and let $\{w_i\}$ be coset representatives of L in $(1/\alpha)L$.
We claim that

(10.19) The $\wp(z + w_i)$ are distinct and give all roots of $A(x) - \wp(\alpha z)B(x)$.

This will imply $\deg(A(x)) = [(1/\alpha)L : L] = [L : \alpha L]$, and the theorem will
be proved.

To prove (10.19), we first show that the $\wp(z + w_i)$ are distinct. If not,
we would have $\wp(z + w_i) = \wp(z + w_j)$ for some $i \ne j$. Then Lemma 10.4
implies that $z + w_i \equiv \pm(z + w_j) \bmod L$. The plus sign implies $w_i \equiv w_j \bmod L$,
which contradicts $i \ne j$, and the minus sign implies $2z \equiv -w_i - w_j \bmod L$,
which contradicts $2z \notin (1/\alpha)L$. Thus the $\wp(z + w_i)$ are distinct.

From (10.18), we see that $A(\wp(z + w_i)) = \wp(\alpha(z + w_i))B(\wp(z + w_i))$. But
$w_i \in (1/\alpha)L$, so that $\alpha(z + w_i) \equiv \alpha z \bmod L$, and hence
$\wp(\alpha(z + w_i)) = \wp(\alpha z)$. This shows that the $\wp(z + w_i)$ are roots of
$A(x) - \wp(\alpha z)B(x)$. To see that all roots arise this way, let u be another
root. Note that $B(u) \ne 0$ since B(u) = 0 implies A(u) = 0, which is impossible
since A(x) and B(x) are relatively prime. By adapting the argument of Lemma
10.4, it is easy to see that $u = \wp(w)$ for some complex number w (see
Exercise 10.14). Then

$$\wp(\alpha z) = \frac{A(u)}{B(u)} = \frac{A(\wp(w))}{B(\wp(w))} = \wp(\alpha w),$$

and using Lemma 10.4 again, we see that $\alpha w \equiv \pm\alpha z \bmod L$. Changing w to
$-w$ if necessary (which doesn't affect $u = \wp(w) = \wp(-w)$), we can assume
that $w \equiv z \bmod (1/\alpha)L$. Working modulo L, this means $w \equiv z + w_i \bmod L$
for some i, and thus $u = \wp(w) = \wp(z + w_i)$ is one of the known roots. This
proves (10.19), and we are done with Theorem 10.14. Q.E.D.
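
The integers a, b, c, d in the step (ii) $\Rightarrow$ (iii) are easy to compute in examples, and their determinant recovers the index $[L : \alpha L]$. The sketch below makes two assumptions that are ours, not the text's: the lattice is $L = [1, \tau]$ with $\tau$ a root of a monic equation $\tau^2 + B\tau + C = 0$ with integers B and C, and the index of $\alpha L$ in L is computed by the standard fact that it equals the absolute value of the determinant of multiplication by $\alpha$ on a basis. The function names are hypothetical.

```python
def multiplication_matrix(a, b, B, C):
    """Integers (a, b, c, d) with alpha = a + b*tau and alpha*tau = c + d*tau.

    Here tau satisfies tau**2 + B*tau + C == 0 with integers B and C.
    """
    # alpha*tau = a*tau + b*tau**2 = a*tau + b*(-B*tau - C)
    c = -b * C
    d = a - b * B
    return a, b, c, d


def quadratic_for_tau(a, b, c, d):
    """Coefficients of b*tau**2 + (a - d)*tau - c = 0, as in the proof."""
    return b, a - d, -c


def lattice_index(a, b, B, C):
    """Index [L : alpha L] for L = [1, tau], computed as |a*d - b*c|."""
    a, b, c, d = multiplication_matrix(a, b, B, C)
    return abs(a * d - b * c)


# alpha = tau = sqrt(-2), where tau**2 + 2 = 0
print(multiplication_matrix(0, 1, 0, 2), lattice_index(0, 1, 0, 2))
# alpha = tau = (1 + sqrt(-7))/2, where tau**2 - tau + 2 = 0
print(multiplication_matrix(0, 1, -1, 2), lattice_index(0, 1, -1, 2))
```

In both examples the index is 2, which matches $N(\sqrt{-2}) = N((1 + \sqrt{-7})/2) = 2$, and the quadratic recovered from (a, b, c, d) is b times the equation defining $\tau$.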


This theorem shows that if an elliptic function has multiplication by some
$\alpha \in \mathbf{C} - \mathbf{R}$, then it has multiplication by an entire order $\mathcal{O}$ in an imaginary
quadratic field. Notice that all of the elements of $\mathcal{O} - \mathbf{Z}$ are genuinely
complex, i.e., not real. This accounts for the name complex multiplication.

One important consequence of Theorem 10.14 is that complex
multiplication is an intrinsic property of the lattice. So rather than talk about
elliptic functions with complex multiplication, it makes more sense to talk
about lattices with complex multiplication. Since changing the lattice by a
constant multiple doesn't affect the complex multiplications, we will work
with homothety classes of lattices.




Using Theorem 10.14, we can relate homothety classes of lattices and
ideal class groups of orders as follows. Fix an order $\mathcal{O}$ in an imaginary
quadratic field, and consider those lattices $L \subset \mathbf{C}$ which have $\mathcal{O}$ as their full
ring of complex multiplications. By Theorem 10.14, we can assume that L
is a proper fractional $\mathcal{O}$-ideal, and conversely, every proper fractional
$\mathcal{O}$-ideal is a lattice with $\mathcal{O}$ as its ring of complex multiplications. Furthermore,
two proper fractional $\mathcal{O}$-ideals are homothetic as lattices if and only if they
determine the same class in the ideal class group $C(\mathcal{O})$ (see Exercise 10.15).
We have thus proved the following:


Corollary 10.20. Let $\mathcal{O}$ be an order in an imaginary quadratic field. Then
there is a one-to-one correspondence between the ideal class group $C(\mathcal{O})$ and
the homothety classes of lattices with $\mathcal{O}$ as their full ring of complex
multiplications. Q.E.D.


It follows that the class number $h(\mathcal{O})$ tells us the number of homothety
classes of lattices having $\mathcal{O}$ as their full ring of complex multiplications.

Here are some examples. First, consider all lattices which have complex
multiplication by $\sqrt{-3}$. This means that we are dealing with an order $\mathcal{O}$
containing $\sqrt{-3}$ in the field $K = \mathbf{Q}(\sqrt{-3})$. Then $\mathcal{O}$ must be either $\mathbf{Z}[\sqrt{-3}]$
or $\mathbf{Z}[\omega]$, $\omega = e^{2\pi i/3}$, and since both of these have class number 1, the only
lattices are $[1, \sqrt{-3}]$ and $[1, \omega]$. Thus, up to homothety, there are only two
lattices with complex multiplication by $\sqrt{-3}$. Next, consider complex
multiplication by $\sqrt{-5}$. Here, $K = \mathbf{Q}(\sqrt{-5})$, and the only order containing $\sqrt{-5}$
is the maximal order $\mathcal{O}_K = \mathbf{Z}[\sqrt{-5}]$. The class number is $h(-20) = 2$, and
since we know the reduced forms of discriminant $-20$, the results of §7
show that up to homothety, the only lattices with complex multiplication by
$\sqrt{-5}$ are $[1, \sqrt{-5}]$ and $[2, 1 + \sqrt{-5}]$ (see Exercise 10.16).
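
The class numbers used in these examples can be recomputed by listing reduced forms. The following sketch enumerates the primitive positive definite forms $ax^2 + bxy + cy^2$ of a negative discriminant D that are reduced in the usual sense ($|b| \le a \le c$, with $b \ge 0$ if $|b| = a$ or $a = c$); the function name `reduced_forms` is our own, and the code is an illustration rather than the method of the text.

```python
from math import gcd, isqrt


def reduced_forms(D):
    """List the reduced primitive positive definite forms (a, b, c)
    with b*b - 4*a*c == D, for a discriminant D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    # A reduced form has 3*a*a <= -D, which bounds a.
    for a in range(1, isqrt(-D // 3) + 1):
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            # When |b| == a or a == c, only b >= 0 is reduced.
            if b < 0 and (-b == a or a == c):
                continue
            if gcd(gcd(a, b), c) != 1:
                continue
            forms.append((a, b, c))
    return forms


print(reduced_forms(-20))
print(len(reduced_forms(-3)), len(reduced_forms(-12)))
```

For D = -20 the two forms found are (1, 0, 5) and (2, 2, 3). Under the dictionary of §7 between forms and ideals, which sends $ax^2 + bxy + cy^2$ to $[a, (-b + \sqrt{D})/2]$, these give the lattices $[1, \sqrt{-5}]$ and $[2, -1 + \sqrt{-5}] = [2, 1 + \sqrt{-5}]$. Discriminants -3 and -12 each give a single form, matching the class number 1 of $\mathbf{Z}[\omega]$ and $\mathbf{Z}[\sqrt{-3}]$.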

The discussion so far has concentrated on the elliptic functions and
their lattices. Since our ultimate goal involves the j-invariant of the lattices,
we need to indicate how complex multiplication influences the j-invariant.
Let's start with the simplest case, complex multiplication by $i = \sqrt{-1}$. Up
to a multiple, the only possible lattice is $L = [1, i]$. To compute $j(L) = j(i)$,
note that $iL = L$, so that by the homogeneity (10.10) of $g_3(L)$,

$$g_3(L) = g_3(iL) = i^{-6}g_3(L) = -g_3(L).$$

This implies that $g_3(L) = 0$, and then the formula (10.8) for the j-invariant
tells us that $j(i) = 1728$. Similarly, one can show that if $L = [1, \omega]$,
$\omega = e^{2\pi i/3}$, then $g_2(L) = 0$, which tells us that $j(\omega) = 0$ (see Exercise 10.17).
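
These two values can be checked numerically from the definitions. The sketch below approximates $g_2$ and $g_3$ of the lattice $[1, \tau]$ by truncating the lattice sums to $|m|, |n| \le N$ and then applies (10.8). The truncation, the default N = 60 and the function names are our assumptions; the sums converge slowly, so this is a sanity check rather than a practical way to compute j.

```python
import cmath


def eisenstein_sum(tau, r, N):
    """Truncated G_r([1, tau]): the sum of (m + n*tau)**(-r) over all
    integer pairs (m, n) != (0, 0) with |m| <= N and |n| <= N."""
    total = 0j
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue
            total += (m + n * tau) ** (-r)
    return total


def j_invariant(tau, N=60):
    """Approximate j([1, tau]) = 1728 * g2^3 / (g2^3 - 27 * g3^2)."""
    g2 = 60 * eisenstein_sum(tau, 4, N)
    g3 = 140 * eisenstein_sum(tau, 6, N)
    return 1728 * g2**3 / (g2**3 - 27 * g3**2)


omega = cmath.exp(2j * cmath.pi / 3)
print(j_invariant(1j))     # close to 1728
print(j_invariant(omega))  # close to 0
```

For $\tau = i$ the square truncation is itself invariant under multiplication by i, so the truncated $g_3$ cancels in pairs just as in the argument above, and the computed value agrees with 1728 up to rounding.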

A more interesting example is given by complex multiplication by $\sqrt{-2}$.
By the above methods, the only lattice involved is $[1, \sqrt{-2}]$, up to
homothety. We will follow the exposition of Stark [97] and show that

$$j(\sqrt{-2}) = 8000.$$

Since $N(\sqrt{-2}) = 2$, Theorem 10.14 tells us that

$$\wp(\sqrt{-2}\,z) = \frac{A(\wp(z))}{B(\wp(z))},$$

where A(x) is quadratic and B(x) is linear. Dividing B(x) into A(x), we
can write this as

$$\wp(\sqrt{-2}\,z) = a\wp(z) + b + \frac{1}{c\wp(z) + d}, \tag{10.21}$$

where a and c are nonzero complex numbers. To exploit this identity, we
will use the Laurent expansion of $\wp(z)$ at z = 0. The differential equation
for $\wp(z)$ shows that the first few terms of the Laurent expansion are

$$\wp(z) = \frac{1}{z^2} + \frac{g_2}{20}z^2 + \frac{g_3}{28}z^4 + \frac{g_2^2}{1200}z^6 + \frac{3g_2g_3}{6160}z^8 + \cdots$$

(this follows easily from the proof of Lemma 10.12; see Exercise 10.18). To
simplify this expansion, first note that $g_2$ and $g_3$ are nonzero, for otherwise
there would be complex multiplication by i or $\omega$, which can't happen for
$L = [1, \sqrt{-2}]$ (see Exercise 10.19). Then, replacing L by a suitable multiple,
the homogeneity of $g_2$ and $g_3$ allows us to assume that $g_2 = 20g$ and
$g_3 = 28g$ for some number g (see Exercise 10.19). With this choice of lattice, the
expansion for $\wp(z)$ can be written

$$\wp(z) = \frac{1}{z^2} + gz^2 + gz^4 + \frac{g^2}{3}z^6 + \cdots,$$

and it follows that the expansion for $\wp(\sqrt{-2}\,z)$ is

$$\wp(\sqrt{-2}\,z) = -\frac{1}{2z^2} - 2gz^2 + 4gz^4 - \frac{8g^2}{3}z^6 + \cdots.$$

Now the constants a and b in (10.21) are the unique constants such that
$\wp(\sqrt{-2}\,z) - a\wp(z) - b$ is zero when z = 0. Comparing the above expansions
for $\wp(z)$ and $\wp(\sqrt{-2}\,z)$, we see that $a = -1/2$ and $b = 0$. Then (10.21) tells
us the remarkable fact that $(\wp(\sqrt{-2}\,z) + \frac{1}{2}\wp(z))^{-1}$ is a linear polynomial in
$\wp(z)$. Using the above expansions, one computes that

$$\begin{aligned} \left(\wp(\sqrt{-2}\,z) + \tfrac{1}{2}\wp(z)\right)^{-1} &= \left(-\frac{3g}{2}z^2 + \frac{9g}{2}z^4 - \frac{5g^2}{2}z^6 + \cdots\right)^{-1} \\ &= -\frac{2}{3g}\cdot\frac{1}{z^2} - \frac{2}{g} + \left(\frac{10}{9} - \frac{6}{g}\right)z^2 + \cdots \end{aligned} \tag{10.22}$$

(see Exercise 10.19). By (10.21), this expression is linear in $\wp(z)$. Looking at
the behavior at z = 0, it follows that the bottom line of (10.22) must equal

$$-\frac{2}{3g}\wp(z) - \frac{2}{g},$$



and then comparing the coefficients of $z^2$ implies that

$$\frac{10}{9} - \frac{6}{g} = -\frac{2}{3}.$$

Solving this equation for g yields $g = \frac{27}{8}$, so that

$$g_2 = 20g = \frac{5 \cdot 27}{2}$$
$$g_3 = 28g = \frac{7 \cdot 27}{2},$$

and thus

$$j(\sqrt{-2}) = 1728\,\frac{5^3}{5^3 - 2 \cdot 7^2} = 8000 = 20^3.$$

By a similar computation, one can also show that

$$j\left(\frac{1 + \sqrt{-7}}{2}\right) = -3375 = (-15)^3$$

(see Exercise 10.20). In §12 we will explain why these numbers are cubes.
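
The arithmetic at the end of the $\sqrt{-2}$ computation can be replayed with exact rationals. Matching the $z^2$ coefficient $\frac{10}{9} - \frac{6}{g}$ of the inverse series with the $z^2$ coefficient $-\frac{2}{3}$ of $-\frac{2}{3g}\wp(z) - \frac{2}{g}$ determines g, and then (10.8) gives j. The sketch below does just this; the function names are ours, and it checks only the final arithmetic, not the series expansions themselves.

```python
from fractions import Fraction


def z2_coefficient_of_inverse(g):
    """The z^2 coefficient 10/9 - 6/g in the bottom line of (10.22)."""
    return Fraction(10, 9) - Fraction(6) / g


def solve_for_g():
    """Solve 10/9 - 6/g = -2/3, i.e. 6/g = 10/9 + 2/3."""
    return Fraction(6) / (Fraction(10, 9) + Fraction(2, 3))


def j_from_g2_g3(g2, g3):
    """j = 1728 * g2^3 / (g2^3 - 27 * g3^2), as in (10.8)."""
    return 1728 * g2**3 / (g2**3 - 27 * g3**2)


g = solve_for_g()
print(g, 20 * g, 28 * g)
print(j_from_g2_g3(20 * g, 28 * g))
```

With $g = 27/8$ one gets $g_2 = 135/2$ and $g_3 = 189/2$, and the quotient in (10.8) reduces to $1728 \cdot 125/27 = 8000$.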

Besides allowing us to compute $j(\sqrt{-2})$ and $j((1 + \sqrt{-7})/2)$, the Laurent
series of the $\wp$-function can be used to give an elementary proof that
the j-invariant of a lattice with complex multiplication is an algebraic
number:


Theorem 10.23. Let $\mathcal{O}$ be an order in an imaginary quadratic field, and let
$\mathfrak{a}$ be a proper fractional $\mathcal{O}$-ideal. Then $j(\mathfrak{a})$ is an algebraic number of degree
at most $h(\mathcal{O})$.


Proof. By Lemma 10.12, the Laurent expansion of $\wp(z)$ can be written

$$\wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} a_n(g_2, g_3)z^{2n},$$

where each $a_n(g_2, g_3)$ is a polynomial in $g_2$ and $g_3$ with rational
coefficients. To emphasize the dependence on $g_2$ and $g_3$, we will write $\wp(z)$ as
$\wp(z; g_2, g_3)$.

By assumption, for any $\alpha \in \mathcal{O}$, $\wp(\alpha z)$ is a rational function in $\wp(z)$, say

$$\wp(\alpha z; g_2, g_3) = \frac{A(\wp(z; g_2, g_3))}{B(\wp(z; g_2, g_3))}. \tag{10.24}$$

We then have the Laurent expansion

$$\wp(\alpha z; g_2, g_3) = \frac{1}{\alpha^2z^2} + \sum_{n=1}^{\infty} a_n(g_2, g_3)\alpha^{2n}z^{2n},$$

which means that (10.24) can be regarded as an identity in the field $\mathbf{C}((z))$
of formal meromorphic Laurent series. Recall that $\mathbf{C}((z))$ is the field of
fractions of the formal power series ring $\mathbf{C}[[z]]$, so that an element of
$\mathbf{C}((z))$ is a series of the form $\sum_{n=-M}^{\infty} b_nz^n$, $b_n \in \mathbf{C}$.

Now let $\sigma$ be any automorphism of $\mathbf{C}$. Then $\sigma$ induces an automorphism
of $\mathbf{C}((z))$ by acting on the coefficients. Thus, if we apply $\sigma$ to (10.24), we
obtain the identity

$$\wp(\sigma(\alpha)z; \sigma(g_2), \sigma(g_3)) = \frac{A^{\sigma}(\wp(z; \sigma(g_2), \sigma(g_3)))}{B^{\sigma}(\wp(z; \sigma(g_2), \sigma(g_3)))}, \tag{10.25}$$

where $A^{\sigma}(x)$ (resp. $B^{\sigma}(x)$) is the polynomial obtained by applying $\sigma$ to
the coefficients of A(x) (resp. B(x)). This follows because $a_n(g_2, g_3)$ is a
polynomial in $g_2$ and $g_3$ with rational coefficients. We don't know much
about $\sigma(g_2)$ and $\sigma(g_3)$, but $g_2^3 - 27g_3^2 \ne 0$ implies $\sigma(g_2)^3 - 27\sigma(g_3)^2 \ne 0$. In
§11, we will prove that this condition on $\sigma(g_2)$ and $\sigma(g_3)$ guarantees that
there is a lattice L such that

$$g_2(L) = \sigma(g_2)$$
$$g_3(L) = \sigma(g_3)$$

(see Corollary 11.7). Thus the formal Laurent series $\wp(z; \sigma(g_2), \sigma(g_3))$ is
the Laurent series of the $\wp$-function $\wp(z; L)$, and then (10.25) tells us that
$\wp(z; L)$ has complex multiplication by $\sigma(\alpha)$. This holds for any $\alpha \in \mathcal{O}$, so
that if $\mathcal{O}'$ is the ring of all complex multiplications of L, then we have
proved that

$$\mathcal{O} = \sigma(\mathcal{O}) \subset \mathcal{O}'.$$

If we work with $\sigma^{-1}$ and interchange $\mathfrak{a}$ and L, the above argument shows
that $\mathcal{O}' \subset \mathcal{O}$, which shows that $\mathcal{O}$ is the ring of all complex multiplications
of both $\mathfrak{a}$ and L.

Now consider j-invariants. The above formulas for $g_2(L)$ and $g_3(L)$
imply that

$$j(L) = \sigma(j(\mathfrak{a})). \tag{10.26}$$

Since L has $\mathcal{O}$ as its ring of complex multiplications, Corollary 10.20
implies that there are only $h(\mathcal{O})$ possibilities for j(L). By (10.26), there are
thus at most $h(\mathcal{O})$ possibilities for $\sigma(j(\mathfrak{a}))$. Since $\sigma$ was an arbitrary
automorphism of $\mathbf{C}$, it follows that $j(\mathfrak{a})$ must be an algebraic number, and
in fact the degree of its minimal polynomial over $\mathbf{Q}$ is at most $h(\mathcal{O})$. This
proves the theorem. Q.E.D.


In §11 we will prove the stronger result that $j(\mathfrak{a})$ is an algebraic
integer and that the degree of its minimal polynomial equals the class number
$h(\mathcal{O})$. But we thought it worthwhile to show what can be done by
elementary means. Furthermore, the method of proof used above (the action of
an automorphism on the coefficients of a Laurent expansion) is similar to
some of the arguments to be given in §11.

For a more classical introduction to complex multiplication, the reader
should consult the recent book [9] by Borwein and Borwein.


D. Exercises 


10.1. This exercise is concerned with the proof of Lemma 10.2.

(a) If $L = [\omega_1, \omega_2]$ is a lattice, let
$M = \min\{|x\omega_1 + y\omega_2| : x^2 + y^2 = 1\}$. Show that M > 0 and that
$|x\omega_1 + y\omega_2| \ge M\sqrt{x^2 + y^2}$ for all $x, y \in \mathbf{R}$.

(b) Show that the integral $\iint_{x^2 + y^2 \ge 1} (x^2 + y^2)^{-r/2}\,dx\,dy$ converges
when r > 2.

(c) Show that the series $\sum'_{m,n} (m^2 + n^2)^{-r/2}$ converges when r > 2.
Hint: compare the series to the integral in part (b).


10.2. In the proof of Theorem 10.1, we showed that
$\wp'(z) = -2\sum_{\omega \in L} (z - \omega)^{-3}$.
(a) Show that this series converges absolutely for $z \notin L$.
(b) Using (a), show that $\wp'(z + \omega) = \wp'(z)$ for $\omega \in L$.

10.3. Show that for $|x| < 1$, $(1 - x)^{-2} = \sum_{n=0}^{\infty} (n + 1)x^n$. Hint:
differentiate the standard identity $(1 - x)^{-1} = \sum_{n=0}^{\infty} x^n$.


10.4. Use Lemma 10.3 to show that

$$\wp(z)^3 = \frac{1}{z^6} + \frac{9G_4(L)}{z^2} + 15G_6(L) + \cdots$$
$$\wp'(z)^2 = \frac{4}{z^6} - \frac{24G_4(L)}{z^2} - 80G_6(L) + \cdots,$$

where $+\cdots$ indicates terms involving positive powers of z.


10.5. Use Liouville's Theorem to show that a holomorphic elliptic function
f(z) must be constant. Hint: consider $|f(z)|$ on the parallelogram
$\{s\omega_1 + t\omega_2 : 0 \le s, t \le 1\}$. Exercise 10.6 will be useful.


10.6. Let $L = [\omega_1, \omega_2]$ be a lattice. For a fixed $\alpha \in \mathbf{C}$, consider the
parallelogram $P = \{\alpha + s\omega_1 + t\omega_2 : 0 \le s, t \le 1\}$. Show that if $z \in \mathbf{C}$, then
there is $z' \in P$ such that $z \equiv z' \bmod L$. Note that the parallelogram
used in Lemma 10.4 corresponds to $\alpha = \delta\omega_1 + \delta\omega_2$.


10.7. As in the proof of the addition theorem, let

$$G(z) = \wp(z + w) + \wp(z) + \wp(w) - \frac{1}{4}\left(\frac{\wp'(z) - \wp'(w)}{\wp(z) - \wp(w)}\right)^2.$$

(a) If $2w \notin L$, complete the argument begun in the text to show
that G(-w) is defined.

(b) Prove the addition law when $2w \in L$. Hint: take a sequence of
points $w_i$ converging to w such that $2w_i \notin L$ for all i.


10.8. Let $4x^3 - g_2x - g_3$ be a cubic polynomial with roots $e_1$, $e_2$ and $e_3$.

(a) Show that $e_1 + e_2 + e_3 = 0$, $e_1e_2 + e_1e_3 + e_2e_3 = -g_2/4$ and
$e_1e_2e_3 = g_3/4$.

(b) Using (a), show that $g_2^3 - 27g_3^2 = 16(e_1 - e_2)^2(e_1 - e_3)^2(e_2 - e_3)^2$.


10.9. Let L and L' be lattices such that j(L) = j(L'). If $g_2(L') = 0$ or
$g_3(L') = 0$, prove that there is a complex number $\lambda$ such that (10.11)
holds. Hint: by Proposition 10.7, they can't both be zero.


10.10. Let the Laurent expansion of the $\wp$-function about 0 be
$\wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} a_nz^{2n}$, where $a_n = (2n + 1)G_{2n+2}$ is as in Lemma 10.3.

(a) Use the differential equation for the $\wp$-function to show that
$\wp''(z) = 6\wp(z)^2 - (1/2)g_2(L)$.

(b) Use (a) to show that for $n \ge 3$,

$$2n(2n - 1)a_n = 6\left(2a_n + \sum_{i=1}^{n-2} a_i a_{n-1-i}\right).$$


10.11. Let K be an imaginary quadratic field, which can be regarded as a
subfield of $\mathbf{C}$.

(a) If $\mathcal{O}$ is an order in K and $\mathfrak{a} = [\alpha, \beta]$ is a proper fractional
$\mathcal{O}$-ideal, then show that $\alpha$ and $\beta$ are linearly independent over $\mathbf{R}$.
Thus $\mathfrak{a} \subset \mathbf{C}$ is a lattice.

(b) Conversely, let $L \subset \mathbf{C}$ be a lattice which is contained in K.
Show that L is a proper fractional $\mathcal{O}$-ideal for some order $\mathcal{O}$
of K.


10.12. Let L be a lattice, and let n be a positive integer.

(a) Prove that $\wp(nz)$ is a rational function in $\wp(z)$. Hint: use the
addition law and induction on n. For a quicker proof, use
Lemma 10.17.

(b) Adapt the proof of Theorem 10.14 to show that the numerator
of the rational function of part (a) has degree $n^2$ and the
denominator has degree $n^2 - 1$.
10.13. In this exercise we will see how to express elliptic functions for a
given lattice L in terms of $\wp(z)$ and $\wp'(z)$.

(a) Let f(z) be an even elliptic function which is holomorphic on
$\mathbf{C} - L$. Prove that f(z) is a polynomial in $\wp(z)$. Hint: show that
there is a polynomial A(x) such that the Laurent expansion of
$f(z) - A(\wp(z))$ has only terms of nonnegative degree. Then use
Exercise 10.5.

(b) Let f(z) be an even elliptic function that has a pole of order
m at $w \in \mathbf{C}$. We will assume that $w \notin L$.

(i) If $2w \notin L$, prove that $(\wp(z) - \wp(w))^m f(z)$ is holomorphic
at w. Hint: use Corollary 10.5.

(ii) If $2w \in L$, prove that m is even. Hint: $f(z) = f(2w - z)$.

(iii) If $2w \in L$, prove that $(\wp(z) - \wp(w))^{m/2} f(z)$ is holomorphic
at w. Hint: use the proof of Lemma 10.4 to show
that $\wp''(w) \ne 0$.

(c) Show that an even elliptic function f(z) is a rational function
in $\wp(z)$. This will prove Lemma 10.17. Hint: write $L = [\omega_1, \omega_2]$,
and consider the parallelogram $P = \{s\omega_1 + t\omega_2 : 0 \le s, t \le 1\}$.
Note that only finitely many poles of f(z) lie in P. Now use
part (b) to find a polynomial B(x) such that $B(\wp(z))f(z)$ is
holomorphic on $\mathbf{C} - L$ (use Exercise 10.6). Then the claim
follows easily by part (a).

(d) Show that all elliptic functions for L are rational functions in
$\wp(z)$ and $\wp'(z)$. Hint:

$$f(z) = \frac{f(z) + f(-z)}{2} + \left(\frac{f(z) - f(-z)}{2\wp'(z)}\right)\wp'(z).$$


10.14. This exercise is concerned with the proof of Theorem 10.14.

(a) Let A(x) and B(x) be relatively prime polynomials. Prove that
there are only finitely many complex numbers $\lambda$ such that the
polynomial $A(x) - \lambda B(x)$ has a multiple root. Hint: show that
every multiple root is a root of $A(x)B'(x) - A'(x)B(x)$.

(b) Adapt the proof of Lemma 10.4 to show that for any complex
number u, the equation $u = \wp(w)$ always has a solution.


10.15. Let $\mathfrak{a}$ and $\mathfrak{b}$ be two proper fractional $\mathcal{O}$-ideals, where $\mathcal{O}$ is an order
in an imaginary quadratic field. Prove that $\mathfrak{a}$ and $\mathfrak{b}$ determine the
same class in the ideal class group $C(\mathcal{O})$ if and only if they are
homothetic as lattices in $\mathbf{C}$.


10.16. In this exercise we study lattices with complex multiplication by a
fixed $\alpha \in \mathbf{C}$.

(a) Verify that up to a multiple, the only lattices with complex
multiplication by $\sqrt{-5}$ are $[1, \sqrt{-5}]$ and $[2, 1 + \sqrt{-5}]$.

(b) Determine, up to a multiple, all lattices with complex
multiplication by $\sqrt{-14}$. Hint: see the example following Theorem
5.2). 

(c) Let K be an imaginary quadratic field of discriminant $d_K$, and
let $\alpha \in \mathcal{O}_K - \mathbf{Z}$. Show that up to homothety, the number of
lattices with complex multiplication by $\alpha$ is given by

$$\sum_{f \,\mid\, [\mathcal{O}_K : \mathbf{Z}[\alpha]]} h(f^2d_K).$$


10.17. Let $\omega = e^{2\pi i/3}$, and let L be the lattice $[1, \omega]$. Show that
$g_2(L) = j(\omega) = 0$.


10.18. Use the proof of Lemma 10.12 to show that in a neighborhood of
z = 0, the Laurent expansion of the $\wp$-function is

$$\wp(z) = \frac{1}{z^2} + \frac{g_2}{20}z^2 + \frac{g_3}{28}z^4 + \frac{g_2^2}{1200}z^6 + \frac{3g_2g_3}{6160}z^8 + \cdots.$$


10.19. This exercise is concerned with the computation $j(\sqrt{-2}) = 8000$.

(a) If L is a lattice with $g_2(L) = 0$, then prove that L is a multiple
of $[1, \omega]$, $\omega = e^{2\pi i/3}$. Hint: use Theorem 10.9 and Exercise
10.17.

(b) Similarly, show that if $g_3(L) = 0$, then L is a multiple of $[1, i]$.

(c) If L is a lattice with $g_2g_3 \ne 0$, then show that there is a nonzero
complex number $\lambda$ such that for some $g \in \mathbf{C}$, $\lambda^{-4}g_2 = 20g$ and
$\lambda^{-6}g_3 = 28g$. Hint: use (10.10).

(d) Verify the computations made in (10.22).


10.20. Show that $j((1 + \sqrt{-7})/2) = -3375$.


§11. MODULAR FUNCTIONS AND RING CLASS FIELDS 


In §10 we studied complex multiplication, and we saw that for an order O 
in an imaginary quadratic field, the j-invariant j(a) of a proper fractional 
O-ideal a is an algebraic number. This suggests a strong connection with 
number theory, and the goal of §11 is to unravel this connection by relating 
j(a) to the ring class field of O introduced in §9. The precise statement of 
this relation is the “First Main Theorem” of complex multiplication, which 
is the main result of this section: 

Theorem 11.1. Let O be an order in an imaginary quadratic field K, and 
let a be a proper fractional O-ideal. Then the j-invariant j(a) is an algebraic 
integer and K(j(a)) is the ring class field of the order O. 


For a fixed order O, we will prove in §13 that the j(a)’s are all conju- 
gate and hence are roots of the same irreducible polynomial over Q. This 
polynomial is called the class equation of O and will be studied in detail in 
§13. 

Of special interest is the case when O = Z[√−n]. Here, Theorem 11.1 
implies that j(O) = j(√−n) is an algebraic integer and is a primitive 
element of the ring class field of Z[√−n]. It is elementary to see that 
j(√−n) is real (see Exercise 11.1), and thus, by Theorem 9.2, the class 
equation of Z[√−n] can be used to characterize primes of the form 
p = x² + ny². 

Before we can prove Theorem 11.1, we need to learn about modular 
functions and the modular equation. The first step is to study the j-function 
j(τ) in detail. 


A. The j-Function 


The j-invariant j(L) of a lattice L was defined in §10 in terms of the 
constants g2(L) and g3(L). Given τ in the upper half plane 𝔥, we get the 
lattice [1,τ], and then the j-function j(τ) is defined by 

\[ j(\tau) = j([1,\tau]). \]

We also define g2(τ) and g3(τ) by 

\[ g_2(\tau) = g_2([1,\tau]) = 60 \sum_{m,n}{}' \frac{1}{(m+n\tau)^4}, \qquad
   g_3(\tau) = g_3([1,\tau]) = 140 \sum_{m,n}{}' \frac{1}{(m+n\tau)^6}, \]

where the primed sum denotes summation over all ordered pairs of integers 
(m,n) ≠ (0,0). By (10.8), it follows that j(τ) is given by the formula 

\[ j(\tau) = 1728\, \frac{g_2(\tau)^3}{\Delta(\tau)}, \]

where Δ(τ) = g2(τ)³ − 27g3(τ)². 

The properties of j(τ) are closely related to the action of the group 
SL(2,Z) on the upper half plane 𝔥. This action is defined as follows: if 
τ ∈ 𝔥 and γ = \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\) ∈ SL(2,Z), then 

\[ \gamma\tau = \frac{a\tau + b}{c\tau + d}. \]

It is easy to check that γτ ∈ 𝔥 (see Exercise 11.2), and we say that γτ and 
τ are SL(2,Z)-equivalent. Then the j-function has the following properties: 


Theorem 11.2. 
(i) j(τ) is a holomorphic function on 𝔥. 
(ii) If τ and τ' lie in 𝔥, then j(τ) = j(τ') if and only if τ' = γτ for some 
γ ∈ SL(2,Z). In particular, j(τ) is SL(2,Z)-invariant. 
(iii) j: 𝔥 → C is surjective. 
(iv) For τ ∈ 𝔥, j'(τ) ≠ 0, except in the following cases: 
(a) τ = γi, γ ∈ SL(2,Z), where j'(τ) = 0 but j''(τ) ≠ 0. 
(b) τ = γω, ω = e^{2πi/3}, γ ∈ SL(2,Z), where j'(τ) = j''(τ) = 0 but 
j'''(τ) ≠ 0. 


Proof. To prove (i), recall from Proposition 10.7 that Δ(τ) never vanishes. 
Thus it suffices to show that g2(τ) and g3(τ) are holomorphic. For g2(τ), 
this works as follows. By Lemma 10.2, the sum defining g2(τ) converges 
absolutely, but we still must show that the convergence is uniform on compact 
subsets of 𝔥. To see this, first note that g2(τ + 1) = g2(τ) (this follows from 
absolute convergence). Thus it suffices to show that convergence is uniform 
when τ satisfies |Re(τ)| ≤ 1/2 and Im(τ) ≥ ε, where ε < 1 is an arbitrary 
positive number. In this case it is easy to show that 

\[ |m + n\tau| \ge \frac{\epsilon}{2} \sqrt{m^2 + n^2} \]

(see Exercise 11.3), and then uniform convergence is immediate. The proof 
for g3(τ) is similar, so that g2(τ), g3(τ), Δ(τ) and j(τ) are all holomorphic 
on 𝔥. 

Turning to (ii), we need to recall the following fact from §7: if τ,τ' ∈ 𝔥, 
then 

[1,τ] and [1,τ'] are homothetic ⟺ τ' = γτ for some γ ∈ SL(2,Z). 

See (7.8) for the proof (in §7, we assumed that τ and τ' lay in an imaginary 
quadratic field, but the proof given for (7.8) holds for arbitrary τ,τ' ∈ 𝔥). 
From Theorem 10.9, we also know that 

j(τ) = j(τ') ⟺ [1,τ] and [1,τ'] are homothetic. 

Combining these two equivalences, (ii) is immediate. 


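Before we can prove (iii), we need to compute the limits of g2(τ) and 
g3(τ) as Im(τ) → ∞. To study g2(τ), write 

\[ g_2(\tau) = 60 \sum_{m,n}{}' \frac{1}{(m+n\tau)^4}
   = 60 \Biggl( 2 \sum_{m=1}^{\infty} \frac{1}{m^4}
   + \sum_{\substack{m,n=-\infty \\ n \ne 0}}^{\infty} \frac{1}{(m+n\tau)^4} \Biggr). \]

Using the uniform convergence proved in (i), we see that 

\[ \lim_{\mathrm{Im}(\tau)\to\infty} g_2(\tau) = 120 \sum_{m=1}^{\infty} \frac{1}{m^4}, \]

and then the well known formula \(\sum_{m=1}^{\infty} 1/m^4 = \pi^4/90\) (see Serre [88, 
§VII.4.1]) implies that 

\[ \lim_{\mathrm{Im}(\tau)\to\infty} g_2(\tau) = \frac{4}{3}\pi^4. \]

The case of g3(τ) is similar. Here, the key formula is \(\sum_{m=1}^{\infty} 1/m^6 = \pi^6/945\) 
(see Serre [88, §VII.4.1]), and one obtains 

\[ \lim_{\mathrm{Im}(\tau)\to\infty} g_3(\tau) = \frac{8}{27}\pi^6. \]

These limits imply that 

\[ \lim_{\mathrm{Im}(\tau)\to\infty} \Delta(\tau)
   = \Bigl(\frac{4}{3}\pi^4\Bigr)^3 - 27 \Bigl(\frac{8}{27}\pi^6\Bigr)^2 = 0, \]

and it follows easily that 

(11.3) \[ \lim_{\mathrm{Im}(\tau)\to\infty} j(\tau) = \infty. \]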

We will also need the following lemma: 


Lemma 11.4. Every τ ∈ 𝔥 is SL(2,Z)-equivalent to a point τ' which satisfies 
|Re(τ')| ≤ 1/2 and Im(τ') ≥ 1/2. 


Proof. If Im(τ) ≥ 1/2, then there is an integer m such that τ' = τ + m 
satisfies |Re(τ')| ≤ 1/2 and Im(τ') ≥ 1/2. Since τ' = τ + m = 
\(\begin{pmatrix} 1 & m \\ 0 & 1 \end{pmatrix}\)τ, we are done in this case. 

If Im(τ) < 1/2, then by the argument of the previous paragraph, we can 
assume |Re(τ)| ≤ 1/2. It follows that |τ| < 1/√2, so that 

\[ \mathrm{Im}\Bigl(\frac{-1}{\tau}\Bigr) = \frac{\mathrm{Im}(\tau)}{|\tau|^2} > 2\,\mathrm{Im}(\tau). \]

Since −1/τ = \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\)τ, we can more than double the 
imaginary part of τ by using an element of SL(2,Z). Repeating this process 
as often as necessary, we must eventually obtain an SL(2,Z)-equivalent point 
τ' ∈ 𝔥 which satisfies Im(τ') ≥ 1/2. Q.E.D. 
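
The proof of Lemma 11.4 is effectively an algorithm, and it may help to see it run. The following Python sketch alternates the translation τ ↦ τ + m with the inversion τ ↦ −1/τ exactly as in the proof, and keeps track of the matrix γ ∈ SL(2,Z) with τ' = γτ. The function names and the tuple encoding (a, b, c, d) of a matrix are our own choices for illustration, and floating point arithmetic means the output is only an approximation.

```python
def mat_mul(g, h):
    """Product g*h of 2x2 integer matrices stored as tuples (a, b, c, d)."""
    a, b, c, d = g
    e, f, p, q = h
    return (a * e + b * p, a * f + b * q, c * e + d * p, c * f + d * q)


def act(gamma, tau):
    """Action of gamma = (a, b, c, d) on tau by (a*tau + b)/(c*tau + d)."""
    a, b, c, d = gamma
    return (a * tau + b) / (c * tau + d)


def reduce_to_region(tau):
    """Return (gamma, tau2) with tau2 = gamma*tau, |Re tau2| <= 1/2, Im tau2 >= 1/2.

    Follows the proof of Lemma 11.4: translate into the strip, and if the
    imaginary part is still below 1/2, apply tau -> -1/tau (which more than
    doubles the imaginary part) and repeat.
    """
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half plane")
    gamma = (1, 0, 0, 1)
    while True:
        m = -round(tau.real)                    # integer translation
        tau = tau + m
        gamma = mat_mul((1, m, 0, 1), gamma)
        if tau.imag >= 0.5:
            return gamma, tau
        tau = -1 / tau                          # inversion step
        gamma = mat_mul((0, -1, 1, 0), gamma)
```

For example, `reduce_to_region(complex(0.3, 0.01))` needs two inversions before the imaginary part exceeds 1/2.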


This lemma is related to the idea of finding a fundamental domain for 
the action of SL(2,Z) on 𝔥. We won't use this concept in the text, but there 
is an interesting relation between fundamental domains and reduced forms 
(in the sense of Theorem 2.8). See Exercise 11.4 for the details. 

We can now show that the j-function is surjective. Since it's holomorphic 
and nonconstant, its image is an open subset of C. If we can show that 
the image is closed, surjectivity will follow. So take a sequence of points 
j(τ_k) which converges to some w ∈ C. We need to show that w = j(τ) for 
some τ ∈ 𝔥. By Lemma 11.4, we can assume that each τ_k lies in the region 
R = {τ ∈ 𝔥 : |Re(τ)| ≤ 1/2, |Im(τ)| ≥ 1/2}. If the imaginary parts of the τ_k's 
were unbounded, then by the limit (11.3), the j(τ_k)'s would have a 
subsequence which converged to ∞. This is clearly impossible. But once the 
imaginary parts are bounded, the τ_k's lie in a compact subset of 𝔥. Then 
they have a subsequence converging to some τ ∈ 𝔥, and it follows by 
continuity that j(τ) = w, as desired. 

The proof of (iv) will use the following lemma: 


Lemma 11.5. If τ,τ' ∈ 𝔥, then there exist neighborhoods U of τ and V of τ' 
such that the set {γ ∈ SL(2,Z) : γ(U) ∩ V ≠ ∅} is finite. 


Proof. This lemma says that SL(2,Z) acts properly discontinuously on 𝔥, 
and the proof is given in Exercise 11.5. Q.E.D. 


Corollary 11.6. If τ ∈ 𝔥, then τ has a neighborhood U such that for all 
γ ∈ SL(2,Z), 
γ(U) ∩ U ≠ ∅ ⟺ γτ = τ. 


Proof. See Exercise 11.5. Q.E.D. 


Now suppose that j'(τ) = 0. Then τ has a neighborhood U such that for 
w sufficiently close to j(τ), there are τ' ≠ τ'' ∈ U such that j(τ') = j(τ'') = 
w. By (ii), τ'' = γτ' for some γ ≠ ±I, where I = \(\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\). 
Thus γ(U) ∩ U ≠ ∅. By shrinking U and using Corollary 11.6, it follows that 
γτ = τ, γ ≠ ±I. This is a very strong restriction on τ. To see why, let 
γ = \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\). Then γτ = τ implies that 

[1,τ] = (cτ + d)[1,τ] 

(see the proof of (7.8)), and since γ ≠ ±I, an easy argument shows that 
c ≠ 0 (see Exercise 11.6). Thus α = cτ + d ∉ Z, so that by Theorem 10.14, 
the lattice [1,τ] has complex multiplication by an order O in an imaginary 
quadratic field. Furthermore, α[1,τ] = [1,τ] implies that α ∈ O*. However, 
we know that O* = {±1} unless O = O_K for K = Q(i) or Q(ω), ω = e^{2πi/3} 
(see Exercise 11.6). Both of these orders have class number 1, so that 
[1,τ] is homothetic to either [1,i] or [1,ω]. Thus j'(τ) = 0 implies that τ 
is SL(2,Z)-equivalent to either i or ω. 

When τ is SL(2,Z)-equivalent to i, we may assume that τ = i, and we 
need to show that j'(i) = 0 and j''(i) ≠ 0. To prove the former, note that 

\[ j(\tau) - 1728 = 1728 \cdot \frac{27\, g_3(\tau)^2}{\Delta(\tau)}. \]

In §10 we proved that g3(i) = 0, and j'(i) = 0 follows immediately. Now 
suppose that j''(i) = 0. Then i is at least a triple zero of j(τ) − 1728, so 
that for w sufficiently near 1728, there are distinct points τ, τ' and τ'' near 
i such that j(τ) = j(τ') = j(τ'') = w. Then τ' = γ1τ, τ'' = γ2τ, where ±I, 
±γ1 and ±γ2 are all distinct elements of SL(2,Z). By Corollary 11.6, γ1 i = 
γ2 i = i, so that at least 6 elements of SL(2,Z) fix i. Since only 4 elements 
of SL(2,Z) fix i (see Exercise 11.6), we see that j''(i) ≠ 0. The case when 
τ = ω is similar and is left to the reader (see Exercise 11.6). Theorem 11.2 
is proved. Q.E.D. 


The surjectivity of the j-function implies the following result which was 
used in §10: 


Corollary 11.7. Let g2 and g3 be arbitrary complex numbers such that 
g2³ − 27g3² ≠ 0. Then there is a lattice L such that g2(L) = g2 and g3(L) = g3. 


Proof. Since the j-function is surjective and g2³ − 27g3² ≠ 0, there is some 
τ ∈ 𝔥 such that 

\[ j(\tau) = 1728\, \frac{g_2^3}{g_2^3 - 27 g_3^2}. \]

Arguing as in the proof of (10.11), this equation implies that there is a 
nonzero complex number λ such that 

\[ g_2 = \lambda^{-4} g_2(\tau), \qquad g_3 = \lambda^{-6} g_3(\tau). \]

Using (10.10), it follows that L = λ[1,τ] is the desired lattice. Q.E.D. 


Since j(τ) is invariant under SL(2,Z), we see that 

\[ j(\tau + 1) = j\Bigl( \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \tau \Bigr) = j(\tau). \]

This implies that j(τ) is a holomorphic function in q = q(τ) = e^{2πiτ}, 
defined in the region 0 < |q| < 1. Consequently j(τ) has a Laurent expansion 

\[ j(\tau) = \sum_{n=-\infty}^{\infty} c_n q^n, \]

which is called the q-expansion of j(τ). The following theorem will be used 
often in what follows: 


Theorem 11.8. The q-expansion of j(τ) is 

\[ j(\tau) = \frac{1}{q} + 744 + 196884q + \cdots = \frac{1}{q} + \sum_{n=0}^{\infty} c_n q^n, \]

where the coefficients c_n are integers for all n ≥ 0. 


Proof. We will prove this in §12 using the Weber functions and the 
Weierstrass ℘-function. More standard proofs may be found in Apostol [1, §1.15] 
or Lang [73, §4.1]. Q.E.D. 


This theorem is the reason that the factor 1728 appears in the definition 
of the j-invariant: it's exactly the factor needed to guarantee that all of the 
coefficients of the q-expansion are integers without any common divisor. 
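
The first few coefficients in Theorem 11.8 can be checked by machine. The sketch below does not follow the proof promised for §12; instead it assumes two standard identities that are not established in this text, namely j = E4³/Δ with E4 = 1 + 240 Σ σ3(n) qⁿ and the product formula Δ = q ∏ (1 − qⁿ)²⁴ (here Δ is normalized to have leading coefficient 1, which differs from the Δ(τ) above by a constant factor). All function names are ours, and everything is exact integer arithmetic on truncated power series.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def series_mul(f, g, N):
    """Product of two power series given as coefficient lists, truncated to N terms."""
    h = [0] * N
    for i, fi in enumerate(f[:N]):
        if fi:
            for k, gk in enumerate(g[:N - i]):
                h[i + k] += fi * gk
    return h


def j_coefficients(N):
    """Return [c_{-1}, c_0, ..., c_{N-2}], the first N coefficients of j(tau).

    ASSUMPTION (standard, but not proved in this text):
    q * j = E4^3 / prod_{n>=1} (1 - q^n)^24.
    """
    e4 = [1] + [240 * sigma3(n) for n in range(1, N)]
    e4_cubed = series_mul(series_mul(e4, e4, N), e4, N)

    # Build prod (1 - q^n)^24 by repeated in-place multiplication by (1 - q^n).
    prod = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(24):
            for k in range(N - 1, n - 1, -1):
                prod[k] -= prod[k - n]

    # Invert the product; its constant term is 1, so the inverse is integral.
    inv = [1] + [0] * (N - 1)
    for k in range(1, N):
        inv[k] = -sum(prod[i] * inv[k - i] for i in range(1, k + 1))

    return series_mul(e4_cubed, inv, N)
```

Calling `j_coefficients(3)` returns the list `[1, 744, 196884]`, in agreement with the expansion stated in the theorem.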


B. Modular Functions for Γ0(m) 


One can define modular functions for any subgroup of SL(2,Z), but we 
will concentrate on the subgroups Γ0(m) of SL(2,Z), which are defined as 
follows: if m is a positive integer, then 

\[ \Gamma_0(m) = \Bigl\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z}) : c \equiv 0 \bmod m \Bigr\}. \]

Note that Γ0(1) = SL(2,Z). Then a modular function for Γ0(m) is a 
complex-valued function f(τ) defined on the upper half plane 𝔥, except for 
isolated singularities, which satisfies the following three conditions: 

(i) f(τ) is meromorphic on 𝔥. 

(ii) f(τ) is invariant under Γ0(m). 

(iii) f(τ) is meromorphic at the cusps. 

By (ii), we mean that f(γτ) = f(τ) for all τ ∈ 𝔥 and γ ∈ Γ0(m). To explain 
(iii), some more work is needed. Suppose that f(τ) satisfies (i) and (ii), and 
that γ ∈ SL(2,Z). We claim that f(γτ) has period m. To see this, note that 
τ + m = Uτ, where U = \(\begin{pmatrix} 1 & m \\ 0 & 1 \end{pmatrix}\). An easy calculation 
shows that γUγ^{−1} ∈ Γ0(m), and we then obtain 

f(γ(τ + m)) = f(γUτ) = f(γUγ^{−1}γτ) = f(γτ) 

since f(τ) is Γ0(m)-invariant. It follows that if q = q(τ) = e^{2πiτ}, then f(γτ) 
is a holomorphic function in q^{1/m}, defined for 0 < |q^{1/m}| < 1. Thus f(γτ) 
has a Laurent expansion 

\[ f(\gamma\tau) = \sum_{n=-\infty}^{\infty} a_n q^{n/m}, \]

which by abuse of notation we will call the q-expansion of f(γτ). Then 
f(τ) is meromorphic at the cusps if for all γ ∈ SL(2,Z), the q-expansion of 
f(γτ) has only finitely many nonzero coefficients for negative exponents. 
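
The "easy calculation" behind the period m can be spot-checked numerically. Writing γ with bottom row (c, d), a direct multiplication gives −c²m for the lower left entry of γUγ^{−1}, which is divisible by m. The Python sketch below confirms this for a few sample matrices; the helper names and the tuple encoding (a, b, c, d) are ours, chosen only for illustration.

```python
def mat_mul(g, h):
    """Product g*h of 2x2 integer matrices stored as tuples (a, b, c, d)."""
    a, b, c, d = g
    e, f, p, q = h
    return (a * e + b * p, a * f + b * q, c * e + d * p, c * f + d * q)


def inverse(g):
    """Inverse of a determinant-1 matrix (a, b, c, d)."""
    a, b, c, d = g
    return (d, -b, -c, a)


def in_gamma0(g, m):
    """True if g has determinant 1 and lower left entry divisible by m."""
    a, b, c, d = g
    return a * d - b * c == 1 and c % m == 0


def conjugate_translation(gamma, m):
    """Return gamma * U * gamma^{-1}, where U is translation by m."""
    U = (1, m, 0, 1)
    return mat_mul(mat_mul(gamma, U), inverse(gamma))
```

For instance, with γ = (2, 1, 1, 1) and m = 3, the conjugate is (−5, 12, −3, 7), whose lower left entry is −3.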

The basic example of such a function is given by j(τ). It is holomorphic 
on 𝔥, invariant under SL(2,Z), and Theorem 11.8 implies that it is 
meromorphic at the cusps. Thus j(τ) is a modular function for SL(2,Z) = Γ0(1). 
The remarkable fact is that modular functions for both SL(2,Z) and Γ0(m) 
are easily described in terms of the j-function: 


Theorem 11.9. Let m be a positive integer. 

(i) j(τ) is a modular function for SL(2,Z), and every modular function for 
SL(2,Z) is a rational function in j(τ). 

(ii) j(τ) and j(mτ) are modular functions for Γ0(m), and every modular 
function for Γ0(m) is a rational function of j(τ) and j(mτ). 


Proof. Note that (i) is a special case of (ii). It is stated separately not only 
because of its independent interest, but also because it's what we must 
prove first. 

Before beginning the proof, let's make a comment about q-expansions. 
Our definition requires checking the q-expansion of f(γτ) for all γ ∈ 
SL(2,Z). Since f(τ) is Γ0(m)-invariant, we actually need only consider the 
q-expansions of f(γ_iτ), where the γ_i's are right coset representatives of 
Γ0(m) ⊂ SL(2,Z). So there are only finitely many q-expansions to check. 
The nicest case is when f(τ) is a modular function for SL(2,Z), for here 
we need only consider the q-expansion of f(τ). 

We can now prove (i). We've seen that j(τ) is a modular function for 
SL(2,Z), so we need only show that every modular function f(τ) for 
SL(2,Z) is a rational function in j(τ). We will begin by studying some 
special cases. We say that a modular function f(τ) is holomorphic at ∞ if its 
q-expansion involves only nonnegative powers of q. 


Lemma 11.10. 

(i) A holomorphic modular function for SL(2,Z) which is holomorphic at 
∞ is constant. 

(ii) A holomorphic modular function for SL(2,Z) is a polynomial in j(τ). 


Proof. To prove (i), let f(τ) be the modular function in question. Since 
f(τ) is holomorphic at ∞, we know that f(∞) = lim_{Im(τ)→∞} f(τ) exists 
as a complex number. We will show that f(𝔥 ∪ {∞}) is compact. By the 
maximum modulus principle, this will imply that f(τ) is constant. 

Let f(τ_k) be a sequence of points in the image. We need to find a 
subsequence that converges to a point of the form f(τ) for some τ ∈ 𝔥 ∪ {∞}. 
Since f(τ) is SL(2,Z)-invariant, we can assume that the τ_k's lie in the region 
R = {τ ∈ 𝔥 : |Re(τ)| ≤ 1/2, |Im(τ)| ≥ 1/2} (see Lemma 11.4). If the 
imaginary parts of the τ_k's are unbounded, then by the above limit, a 
subsequence converges to f(∞). If the imaginary parts are bounded, then the 
τ_k's lie in a compact subset of 𝔥, and the desired subsequence is easily 
found. This proves (i). 

Turning to (ii), let f(τ) be a holomorphic modular function for SL(2,Z). 
Its q-expansion has only finitely many terms with negative powers of q. 
Since the q-expansion of j(τ) begins with 1/q, one can find a polynomial 
A(x) such that f(τ) − A(j(τ)) is holomorphic at ∞. Since it is also 
holomorphic on 𝔥, it is constant by (i). Thus f(τ) is a polynomial in j(τ), and 
the lemma is proved. Q.E.D. 


To treat the general case, let f(τ) be an arbitrary modular function for 
SL(2,Z), possibly with poles on 𝔥. If we can find a polynomial B(x) such 
that B(j(τ))f(τ) is holomorphic on 𝔥, then the lemma will imply that f(τ) 
is a rational function in j(τ). Since f(τ) has a meromorphic q-expansion, 
it follows that f(τ) has only finitely many poles in the region R = {τ ∈ 𝔥 : 
|Re(τ)| ≤ 1/2, |Im(τ)| ≥ 1/2}, and since f(τ) is SL(2,Z)-invariant, Lemma 
11.4 implies that every pole of f(τ) is SL(2,Z)-equivalent to one in R. 
Thus, if B(j(τ))f(τ) has no poles in R, then it is holomorphic on 𝔥. 

So suppose that f(τ) has a pole of order m at τ0 ∈ R. If j'(τ0) ≠ 0, 
then (j(τ) − j(τ0))^m f(τ) is holomorphic at τ0. In this way we can find a 
polynomial B(x) such that B(j(τ))f(τ) has no poles in R, except possibly 
for those where j'(τ0) = 0. When this happens, part (iv) of Theorem 11.2 
allows us to assume that τ0 = i or ω = e^{2πi/3}. When τ0 = i, we claim that m 
is even. To see this, note that in a neighborhood of i, f(τ) can be written 
in the form 

\[ f(\tau) = \frac{g(\tau)}{(\tau - i)^m}, \]

where g(τ) is holomorphic and g(i) ≠ 0. Now \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\) ∈ 
SL(2,Z) fixes i, so that 

\[ f(\tau) = f(-1/\tau) = \frac{g(-1/\tau)}{(-1/\tau - i)^m}. \]

Comparing these two expressions for f(τ), we see that 

\[ g(-1/\tau) = \frac{1}{(i\tau)^m}\, g(\tau). \]

Evaluating this at τ = i implies that g(i) = (−1)^m g(i), and since g(i) ≠ 0, it 
follows that m is even. By Theorem 11.2, j(τ) − 1728 has a zero of order 2 
at i, and hence (j(τ) − 1728)^{m/2} f(τ) is holomorphic at i. The argument for 
τ0 = ω is similar and is left to the reader (see Exercise 11.7). This completes 
the proof of part (i) of Theorem 11.9. 

To prove part (ii), it is trivial to show that j(τ) is a modular function for 
Γ0(m). As for j(mτ), it is certainly holomorphic, and to check its invariance 
properties, let γ = \(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\) ∈ Γ0(m). Then 

\[ j(m\gamma\tau) = j\Bigl( \frac{m(a\tau + b)}{c\tau + d} \Bigr)
   = j\Bigl( \frac{a \cdot m\tau + bm}{(c/m) \cdot m\tau + d} \Bigr). \]

Since γ ∈ Γ0(m), it follows that γ' = \(\begin{pmatrix} a & bm \\ c/m & d \end{pmatrix}\) ∈ 
SL(2,Z). Thus 

j(mγτ) = j(γ'mτ) = j(mτ), 

which proves that j(mτ) is Γ0(m)-invariant. 

In order to show that j(mτ) is meromorphic at the cusps, we first relate 
Γ0(m) to the set of matrices 

\[ C(m) = \Bigl\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} : ad = m,\ a > 0,\ 0 \le b < d,\ \gcd(a,b,d) = 1 \Bigr\}. \]

The matrix σ0 = \(\begin{pmatrix} m & 0 \\ 0 & 1 \end{pmatrix}\) ∈ C(m) has two properties of 
interest: first, σ0τ = mτ, and second, 

Γ0(m) = (σ0^{−1} SL(2,Z) σ0) ∩ SL(2,Z) 

(see Exercise 11.8). Note that these two properties account for the 
Γ0(m)-invariance of j(mτ) proved above. More generally, we have the following 
lemma: 


Lemma 11.11. For σ ∈ C(m), the set 

(σ0^{−1} SL(2,Z) σ) ∩ SL(2,Z) 

is a right coset of Γ0(m) in SL(2,Z). This induces a one-to-one 
correspondence between right cosets of Γ0(m) and elements of C(m). 


Proof. See Exercise 11.8. Q.E.D. 


This lemma implies that [SL(2,Z) : Γ0(m)] = |C(m)|. One can also 
compute the number of elements in C(m); one gets the formula 

\[ |C(m)| = m \prod_{p \mid m} \Bigl( 1 + \frac{1}{p} \Bigr) \]

(see Exercise 11.9), and thus the index of Γ0(m) in SL(2,Z) is 
m ∏_{p|m} (1 + 1/p). 
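
The counting formula for |C(m)| is easy to test by brute force. The following Python sketch lists the triples (a, b, d) with ad = m, 0 ≤ b < d and gcd(a, b, d) = 1 directly from the definition of C(m), and compares the count with m ∏ (1 + 1/p). This is only a numerical check for small m, not a substitute for Exercise 11.9; the function names are ours.

```python
from math import gcd


def C(m):
    """List the matrices (a b; 0 d) in C(m) as triples (a, b, d)."""
    return [(a, b, m // a)
            for a in range(1, m + 1) if m % a == 0
            for b in range(m // a)
            if gcd(gcd(a, b), m // a) == 1]


def index_formula(m):
    """Compute m * prod_{p | m} (1 + 1/p) using exact integer arithmetic."""
    result, n, p = m, m, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:                      # leftover prime factor
        result = result // n * (n + 1)
    return result
```

For a prime p one finds the p + 1 matrices with (a, d) = (1, p), b = 0, ..., p − 1, together with (a, b, d) = (p, 0, 1); for m = 4 the list has 6 entries.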

We can now compute some q-expansions. Fix γ ∈ SL(2,Z), and choose 
σ ∈ C(m) so that γ lies in the right coset corresponding to σ in Lemma 
11.11. This means that σ0γ = γ̃σ for some γ̃ ∈ SL(2,Z), and hence j(mγτ) 
= j(σ0γτ) = j(γ̃στ) = j(στ) since j(τ) is SL(2,Z)-invariant. Hence 

(11.12) j(mγτ) = j(στ). 

Suppose that σ = \(\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\). We know from Theorem 11.8 
that the q-expansion of j(τ) is 

\[ j(\tau) = \frac{1}{q} + \sum_{n=0}^{\infty} c_n q^n, \qquad c_n \in \mathbb{Z}, \]

and since στ = (aτ + b)/d, it follows that 

\[ q(\sigma\tau) = e^{2\pi i (a\tau + b)/d} = e^{2\pi i b/d}\, e^{2\pi i a\tau/d}. \]

If we set ζ_m = e^{2πi/m}, we can write this as 

\[ q(\sigma\tau) = \zeta_m^{ab}\, (q^{1/m})^{a^2} \]

since ad = m. This gives us the q-expansion 

(11.13) \[ j(m\gamma\tau) = j(\sigma\tau) = \frac{\zeta_m^{-ab}}{(q^{1/m})^{a^2}}
   + \sum_{n=0}^{\infty} c_n \zeta_m^{abn} (q^{1/m})^{a^2 n}, \qquad c_n \in \mathbb{Z}. \]

There are only finitely many negative exponents, which shows that j(mτ) is 
meromorphic at the cusps, and thus j(mτ) is a modular function for Γ0(m). 

The next step is to introduce the modular equation Φ_m(X,Y). Let the 
right cosets of Γ0(m) in SL(2,Z) be Γ0(m)γ_i, i = 1,...,|C(m)|. Then 
consider the polynomial in X 

\[ \Phi_m(X,\tau) = \prod_{i=1}^{|C(m)|} (X - j(m\gamma_i\tau)). \]

We will prove that this expression is a polynomial in X and j(τ). To see 
this, consider the coefficients of Φ_m(X,τ). Being symmetric polynomials in 
the j(mγ_iτ)'s, they are certainly holomorphic. To check invariance under 
SL(2,Z), pick γ ∈ SL(2,Z). Then the cosets Γ0(m)γ_iγ are a permutation of 
the Γ0(m)γ_i's, and since j(mτ) is invariant under Γ0(m), the j(mγ_iγτ)'s are 
a permutation of the j(mγ_iτ)'s. This shows that the coefficients of Φ_m(X,τ) 
are invariant under SL(2,Z). 

We next have to show that the coefficients are meromorphic at infinity. 
Rather than expand in powers of q, it suffices to expand in terms 
of q^{1/m} = e^{2πiτ/m} and show that only finitely many negative exponents 
appear. By (11.12), we know that j(mγ_iτ) = j(στ) for some σ ∈ C(m), and then 
(11.13) shows that the q-expansion for j(mγ_iτ) has only finitely many 
negative exponents. Since the coefficients are polynomials in the j(mγ_iτ)'s, 
they clearly are meromorphic at the cusps. 

This proves that the coefficients of Φ_m(X,τ) are holomorphic modular 
functions, and thus, by Lemma 11.10, they are polynomials in j(τ). This 
means that there is a polynomial 

\[ \Phi_m(X,Y) \in \mathbb{C}[X,Y] \]

such that 

(11.14) \[ \Phi_m(X, j(\tau)) = \prod_{i=1}^{|C(m)|} (X - j(m\gamma_i\tau)). \]

The equation Φ_m(X,Y) = 0 is called the modular equation, and by abuse 
of terminology we will call Φ_m(X,Y) the modular equation. Using some 
simple field theory, it can be proved that Φ_m(X,Y) is irreducible as a 
polynomial in X (see Exercise 11.10). 

By (11.12), each j(mγ_iτ) can be written j(στ) for a unique σ ∈ C(m). 
Thus we can also express the modular equation in the form 

(11.15) \[ \Phi_m(X, j(\tau)) = \prod_{\sigma \in C(m)} (X - j(\sigma\tau)). \]

Note that j(mτ) is always one of the j(στ)'s since 
\(\begin{pmatrix} m & 0 \\ 0 & 1 \end{pmatrix}\) ∈ C(m). Hence 

Φ_m(j(mτ), j(τ)) = 0, 

which is one of the important properties of the modular equation. Note that 
the degree of Φ_m(X,Y) in X is |C(m)|, which we know equals 
m ∏_{p|m} (1 + 1/p). 

Now let f(τ) be an arbitrary modular function for Γ0(m). To prove that 
f(τ) is a rational function in j(τ) and j(mτ), consider the function 

(11.16) \[ G(X,\tau) = \Phi_m(X, j(\tau)) \sum_{i=1}^{|C(m)|} \frac{f(\gamma_i\tau)}{X - j(m\gamma_i\tau)}
   = \sum_{i=1}^{|C(m)|} f(\gamma_i\tau) \prod_{j \ne i} (X - j(m\gamma_j\tau)). \]


This is a polynomial in X, and we claim that its coefficients are modular 
functions for SL(2,Z). The proof is similar to what we did for the modular 
equation, and the details are left to the reader (see Exercise 11.11). But 
once the coefficients are modular functions for SL(2,Z), they are rational 
functions of j(τ) by what we proved above. Hence G(X,τ) is a polynomial 

\[ G(X, j(\tau)) \in \mathbb{C}(j(\tau))[X]. \]

We can assume that γ_1 is the identity matrix. By the product rule, we 
obtain 

\[ \frac{\partial \Phi_m}{\partial X}(j(m\tau), j(\tau)) = \prod_{j \ne 1} (j(m\tau) - j(m\gamma_j\tau)). \]

Thus, substituting X = j(mτ) in (11.16) gives 

\[ G(j(m\tau), j(\tau)) = f(\tau)\, \frac{\partial \Phi_m}{\partial X}(j(m\tau), j(\tau)). \]

Now Φ_m(X, j(τ)) is irreducible (see Exercise 11.10) and hence separable, 
so that (∂/∂X)Φ_m(j(mτ), j(τ)) ≠ 0. Thus we can write 

(11.17) \[ f(\tau) = \frac{G(j(m\tau), j(\tau))}{\dfrac{\partial \Phi_m}{\partial X}(j(m\tau), j(\tau))}, \]

which proves that f(τ) is a rational function in j(τ) and j(mτ). This 
completes the proof of Theorem 11.9. Q.E.D. 


There is a large literature on modular functions, and the reader may 
wish to consult Apostol [1], Koblitz [67], Lang [73] or Shimura [90] to learn 
more about these remarkable functions. 


C. The Modular Equation Φ_m(X,Y) 


The modular equation, as defined by equations (11.14) or (11.15), will play 
a crucial role in what follows. In particular, we will make heavy use of 
the arithmetic properties of Φ_m(X,Y), which are given in the following 
theorem: 


Theorem 11.18. Let m be a positive integer. 
(i) Φ_m(X,Y) ∈ Z[X,Y]. 
(ii) Φ_m(X,Y) is irreducible when regarded as a polynomial in X. 
(iii) Φ_m(X,Y) = Φ_m(Y,X) if m > 1. 
(iv) If m is not a perfect square, then Φ_m(X,X) is a polynomial of degree 
> 1 whose leading coefficient is ±1. 
(v) If m is a prime p, then Φ_p(X,Y) ≡ (X^p − Y)(X − Y^p) mod pZ[X,Y]. 


Proof. To prove (i), it suffices to show that an elementary symmetric 
function f(τ) in the j(στ)'s, σ ∈ C(m), is a polynomial in j(τ) with integer 
coefficients. We begin by studying the q-expansion of f(τ) in more detail. 
Let ζ_m = e^{2πi/m}. By (11.13), each j(στ) lies in the field of formal 
meromorphic Laurent series Q(ζ_m)((q^{1/m})), and since f(τ) is an integer 
polynomial in the j(στ)'s, f(τ) also lies in Q(ζ_m)((q^{1/m})). 

We claim that f(τ) is contained in the smaller field Q((q^{1/m})). To see 
this, we will use Galois theory. An automorphism ψ ∈ Gal(Q(ζ_m)/Q) 
determines an automorphism of Q(ζ_m)((q^{1/m})) by acting on the coefficients. 
Given σ = \(\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\) ∈ C(m), let's see how ψ affects j(στ). 
We know that ψ(ζ_m) = ζ_m^k for some integer k relatively prime to m, and 
from (11.13), it follows that 

\[ \psi(j(\sigma\tau)) = \frac{\zeta_m^{-abk}}{(q^{1/m})^{a^2}}
   + \sum_{n=0}^{\infty} c_n \zeta_m^{abkn} (q^{1/m})^{a^2 n} \]

since all of the c_n's are integers. Let b' be the unique integer 0 ≤ b' < d 
such that b' ≡ bk mod d. Since ad = m, we have ζ_m^{abk} = ζ_m^{ab'}, and 
consequently the above formula can be written 

\[ \psi(j(\sigma\tau)) = \frac{\zeta_m^{-ab'}}{(q^{1/m})^{a^2}}
   + \sum_{n=0}^{\infty} c_n \zeta_m^{ab'n} (q^{1/m})^{a^2 n}. \]

If we let σ' = \(\begin{pmatrix} a & b' \\ 0 & d \end{pmatrix}\), then σ' ∈ C(m), and (11.13) 
implies that 

ψ(j(στ)) = j(σ'τ). 

Thus the elements of Gal(Q(ζ_m)/Q) permute the j(στ)'s. Since f(τ) is 
symmetric in the j(στ)'s, it follows that f(τ) ∈ Q((q^{1/m})). 

We conclude that f(τ) ∈ Z((q)) since the q-expansion of f(τ) involves 
only integral powers of q and the coefficients of the q-expansion are 
algebraic integers. It remains to show that f(τ) is an integer polynomial in 
j(τ). By Lemma 11.10, we can find A(X) ∈ C[X] such that f(τ) = A(j(τ)). 
Recall from the proof of Lemma 11.10 that A(X) was chosen so that the 
q-expansion of f(τ) − A(j(τ)) has only terms of degree ≥ 0. Since the 
expansions of f(τ) and j(τ) have integer coefficients and j(τ) = 1/q + ···, it 
follows that A(X) ∈ Z[X]. Thus f(τ) = A(j(τ)) is an integer polynomial in 
j(τ), and (i) is proved. 

We should mention that the passage from the coefficients of the 
q-expansion to the coefficients of the polynomial A(X) is a special case of 
Hasse's q-expansion principle; see Exercise 11.12 for a precise formulation. 

A proof of (ii) is given in Exercise 11.10, and a proof of (iii) may be 
found in Lang [73, §5.2, Theorem 3]. 

Turning to (iv), assume that m is not a square. We want to study the 
leading term of the integer polynomial Φ_m(X,X). Replacing X with j(τ), 
it suffices to study the coefficient of the most negative power of q in the 
q-expansion of Φ_m(j(τ), j(τ)). However, given σ = 
\(\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\) ∈ C(m), (11.13) tells us that 

(11.19) \[ j(\tau) - j(\sigma\tau) = \frac{1}{q} - \frac{\zeta_m^{-ab}}{(q^{1/m})^{a^2}}
   + \sum_{n=0}^{\infty} d_n (q^{1/m})^n \]

for some coefficients d_n. Since m is not a perfect square, we know that 
a ≠ d, i.e., a/d ≠ 1. Thus the coefficient of the most negative term in 
(11.19) is a root of unity. By (11.15), Φ_m(j(τ), j(τ)) is the product of the 
factors (11.19), so that the coefficient of the most negative power of q in 
Φ_m(j(τ), j(τ)) is also a root of unity. But this coefficient is an integer, and 
thus it must be ±1, as claimed. 

Finally, we turn to (v). Here, we are assuming that m = p, where p is 
prime. Let ζ_p = e^{2πi/p}. We will use the following notation: given f(τ) and 
g(τ) in Z[ζ_p]((q^{1/p})) and α ∈ Z[ζ_p], we will write 

f(τ) ≡ g(τ) mod α 

to indicate that f(τ) − g(τ) ∈ αZ[ζ_p]((q^{1/p})). 

Since p is prime, the elements of C(p) are easy to write down: 

\[ \sigma_i = \begin{pmatrix} 1 & i \\ 0 & p \end{pmatrix},\ i = 0,\ldots,p-1,
   \qquad \sigma_p = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}. \]

If 0 ≤ i ≤ p − 1, then (11.13) tells us that 

\[ j(\sigma_i\tau) = \frac{\zeta_p^{-i}}{q^{1/p}} + \sum_{n=0}^{\infty} c_n \zeta_p^{in} (q^{1/p})^n
   \equiv \frac{1}{q^{1/p}} + \sum_{n=0}^{\infty} c_n (q^{1/p})^n \bmod 1 - \zeta_p, \]

which implies that 

(11.20) \[ j(\sigma_i\tau) \equiv j(\sigma_0\tau) \bmod 1 - \zeta_p \]

for 0 ≤ i ≤ p − 1. Turning to j(σ_pτ), here (11.13) tells us that 

\[ j(\sigma_p\tau) = \frac{1}{q^p} + \sum_{n=0}^{\infty} c_n q^{pn}, \]

and since c_n^p ≡ c_n mod p, it follows easily that 

\[ j(\sigma_p\tau) \equiv j(\tau)^p \bmod p. \]

Since 1 − ζ_p divides p in Z[ζ_p] (see Exercise 11.13), the above congruence 
can be written 

(11.21) \[ j(\sigma_p\tau) \equiv j(\tau)^p \bmod 1 - \zeta_p. \]

Then (11.20) and (11.21) imply that 

\[ \Phi_p(X, j(\tau)) = \prod_{i=0}^{p} (X - j(\sigma_i\tau))
   \equiv (X - j(\sigma_0\tau))^p (X - j(\tau)^p)
   \equiv (X^p - j(\sigma_0\tau)^p)(X - j(\tau)^p) \bmod 1 - \zeta_p, \]

where we are now working in the ring Z[ζ_p]((q^{1/p}))[X]. However, the 
argument used to prove (11.21) is easily adapted to prove that 

\[ j(\tau) \equiv j(\sigma_0\tau)^p \bmod 1 - \zeta_p \]

(see Exercise 11.14), and then we obtain 

\[ \Phi_p(X, j(\tau)) \equiv (X^p - j(\tau))(X - j(\tau)^p) \bmod 1 - \zeta_p. \]

The two sides of this congruence lie in Z((q))[X], so that the coefficients 
of the difference are ordinary integers divisible by 1 − ζ_p in the ring Z[ζ_p]. 
This implies that all of the coefficients are divisible by p (see Exercise 
11.13), and thus 

\[ \Phi_p(X, j(\tau)) \equiv (X^p - j(\tau))(X - j(\tau)^p) \bmod p\mathbb{Z}((q))[X]. \]

Then the Hasse q-expansion principle (used in the proof of (i)) shows that 

\[ \Phi_p(X,Y) \equiv (X^p - Y)(X - Y^p) \bmod p\mathbb{Z}[X,Y], \]

as desired (see Exercise 11.15). The above congruence was first discovered 
by Kronecker (in a slightly different context) and is sometimes called 
Kronecker's congruence. This completes the proof of Theorem 11.18. Q.E.D. 


The properties of the modular equation are straightforward 
consequences of the properties of the j-function, which makes the modular 
equation seem like a reasonable object to deal with. This is true as long as one 
works at the abstract level, but as soon as one asks for concrete 
examples, the situation gets surprisingly complicated. For example, when m = 3, 
Smith [94] showed that Φ_3(X,Y) is the polynomial 

(11.22) 
\[ \begin{aligned}
 & X(X + 2^{15} \cdot 3 \cdot 5^3)^3 + Y(Y + 2^{15} \cdot 3 \cdot 5^3)^3 \\
 & \quad - X^3Y^3 + 2^3 \cdot 3^2 \cdot 31\, X^2Y^2(X + Y) \\
 & \quad - 2^2 \cdot 3^3 \cdot 9907\, XY(X^2 + Y^2) + 2 \cdot 3^4 \cdot 13 \cdot 193 \cdot 6367\, X^2Y^2 \\
 & \quad + 2^{16} \cdot 3^5 \cdot 5^3 \cdot 17 \cdot 263\, XY(X + Y) - 2^{31} \cdot 5^6 \cdot 22973\, XY.
\end{aligned} \]

The modular equation Φ_m(X,Y) has been computed for m = 5, 7 and 11 
(see Hermann [53] and Kaltofen and Yui [66]), and in §13 we will discuss 
the problem of computing Φ_m(X,Y) for general m. 
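
Several parts of Theorem 11.18 can be tested on the explicit polynomial (11.22). The Python sketch below stores Φ_3(X,Y) as a dictionary of coefficients obtained by expanding (11.22), and then checks the symmetry in (iii), the leading coefficient in (iv), and Kronecker's congruence (v) for p = 3. It also confirms that 8000 = j(√−2) (see Exercise 10.19) and 0 = j(ω) (see Exercise 10.17) are roots of Φ_3(X,X), as one expects since 1 + √−2 and 1 − ω both have norm 3. The dictionary layout and helper names are ours, and the expansion of the two fourth powers was done by hand, so this is a check under those assumptions rather than an independent computation of Φ_3.

```python
C0 = 2**15 * 3 * 5**3   # the constant appearing in X(X + C0)^3

# Coefficient of X^i Y^j in Phi_3(X, Y), keyed by (i, j), expanded from (11.22).
PHI3 = {
    (4, 0): 1, (0, 4): 1,
    (3, 0): 3 * C0, (0, 3): 3 * C0,
    (2, 0): 3 * C0**2, (0, 2): 3 * C0**2,
    (1, 0): C0**3, (0, 1): C0**3,
    (3, 3): -1,
    (3, 2): 2**3 * 3**2 * 31, (2, 3): 2**3 * 3**2 * 31,
    (3, 1): -(2**2 * 3**3 * 9907), (1, 3): -(2**2 * 3**3 * 9907),
    (2, 2): 2 * 3**4 * 13 * 193 * 6367,
    (2, 1): 2**16 * 3**5 * 5**3 * 17 * 263,
    (1, 2): 2**16 * 3**5 * 5**3 * 17 * 263,
    (1, 1): -(2**31 * 5**6 * 22973),
}

# (X^3 - Y)(X - Y^3) = X^4 - X^3 Y^3 - X Y + Y^4
KRONECKER_3 = {(4, 0): 1, (3, 3): -1, (1, 1): -1, (0, 4): 1}


def evaluate(poly, x, y):
    """Evaluate a polynomial stored as {(i, j): coefficient} at (x, y)."""
    return sum(c * x**i * y**j for (i, j), c in poly.items())


def is_symmetric(poly):
    """True if the coefficient of X^i Y^j equals that of X^j Y^i."""
    return all(poly.get((j, i), 0) == c for (i, j), c in poly.items())


def congruent_mod(poly1, poly2, p):
    """True if poly1 - poly2 has all coefficients divisible by p."""
    keys = set(poly1) | set(poly2)
    return all((poly1.get(k, 0) - poly2.get(k, 0)) % p == 0 for k in keys)


def diagonal_leading_coefficient(poly):
    """Leading coefficient of poly(X, X) as a polynomial in X."""
    diag = {}
    for (i, j), c in poly.items():
        diag[i + j] = diag.get(i + j, 0) + c
    return diag[max(k for k, c in diag.items() if c != 0)]
```

Here `congruent_mod(PHI3, KRONECKER_3, 3)` holds, and `diagonal_leading_coefficient(PHI3)` is −1, coming from the term −X³Y³.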

Before we can apply the modular equation to complex multiplication, 
one task remains: we need to understand the modular equation in terms of 
j-invariants of lattices. The basic idea is that if L is a lattice, then the roots 
of Φ_m(X, j(L)) = 0 are given by the j-invariants of those sublattices L' ⊂ 
L which satisfy: 

(i) L' is a sublattice of index m in L, i.e., [L : L'] = m. 

(ii) The quotient L/L' is a cyclic group. 

In this situation, we say that L' is a cyclic sublattice of L of index m. Here 
is the precise statement of what we want to prove: 


Theorem 11.23. Let m be a positive integer. If u,v ∈ C, then Φ_m(u,v) = 0 if 
and only if there is a lattice L and a cyclic sublattice L' ⊂ L of index m such 
that u = j(L') and v = j(L). 


Proof. We will first study the cyclic sublattices of the lattice [1,τ], τ ∈ 𝔥: 


Lemma 11.24. Let τ ∈ 𝔥, and consider the lattice [1,τ]. 

(i) Given a cyclic sublattice L' ⊂ [1,τ] of index m, there is a unique σ = 
\(\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\) ∈ C(m) such that L' = d[1,στ]. 

(ii) Conversely, if σ = \(\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\) ∈ C(m), then d[1,στ] is a 
cyclic sublattice of [1,τ] of index m. 


Proof. First recall that C(m) is the set of matrices 

\[ C(m) = \Bigl\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} : ad = m,\ a > 0,\ 0 \le b < d,\ \gcd(a,b,d) = 1 \Bigr\}. \]

A sublattice L' ⊂ L = [1,τ] can be written L' = [aτ + b, cτ + d], and in 
Exercise 7.15 we proved that [L : L'] = |ad − bc| = m. Furthermore, a 
standard argument using elementary divisors shows that 

(11.25) L/L' is cyclic ⟺ gcd(a,b,c,d) = 1 

(see, for example, Lang [73, pp. 51-52]). Another proof of (11.25) is given 
in Exercise 11.16. 

Now suppose that L' ⊂ [1,τ] is cyclic of index m. If d is the smallest 
positive integer contained in L', then it follows easily that L' is of the form 
L' = [d, aτ + b] (see Exercise 11.17). We may assume that a > 0, and then 
ad = m. However, if k is any integer, then 

L' = [d, (aτ + b) + kd] = [d, aτ + (b + kd)], 

so that by choosing k appropriately, we can assume 0 ≤ b < d. We also have 
gcd(a,b,d) = 1 by (11.25), and thus the matrix σ = 
\(\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}\) lies in C(m). Then 

L' = [d, aτ + b] = d[1, (aτ + b)/d] = d[1,στ] 

shows that L' has the desired form. It is straightforward to prove that σ ∈ 
C(m) is uniquely determined by L' (see Exercise 11.17), and (i) is proved. 
The proof of (ii) follows immediately from (11.25), and we are done. 
Q.E.D. 
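
The criterion (11.25) can be confirmed experimentally for the sublattices L' = [d, aτ + b] used in this proof. In the Python sketch below, an element x + yτ of [1,τ] is represented by the integer pair (x, y), so that L' is spanned by (d, 0) and (b, a); we then compute element orders in the quotient [1,τ]/L' by brute force and compare "some element has order m" with the condition gcd(a, b, d) = 1. The coordinates and function names are our own devices for the check, which only covers small m.

```python
from math import gcd


def max_order(a, b, d):
    """Largest order of an element of [1, tau] / [d, a*tau + b].

    Elements x + y*tau are stored as pairs (x, y). The quotient has
    a*d elements, represented by 0 <= x < d, 0 <= y < a.
    """
    def in_sublattice(x, y):
        # x + y*tau = s*d + t*(a*tau + b) forces y = a*t and x - b*t in dZ.
        if y % a != 0:
            return False
        t = y // a
        return (x - b * t) % d == 0

    best = 1
    for x in range(d):
        for y in range(a):
            k = 1
            while not in_sublattice(k * x, k * y):
                k += 1
            best = max(best, k)
    return best


def is_cyclic(a, b, d):
    """True if the quotient [1, tau] / [d, a*tau + b] is a cyclic group."""
    return max_order(a, b, d) == a * d
```

For m = 4 the sublattice with (a, b, d) = (2, 0, 2) is 2[1,τ], whose quotient is (Z/2Z)² and hence not cyclic, while (a, b, d) = (2, 1, 2) gives a cyclic quotient generated by the class of τ.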


By this lemma, the j-invariants of the cyclic sublattices L' of index m of 
[1,τ] are given by 

j(L') = j(d[1,στ]) = j([1,στ]) = j(στ). 

By (11.15), it follows that the roots of Φ_m(X, j(τ)) = 0 are exactly the 
j-invariants of the cyclic sublattices of index m of [1,τ]. It is now easy to 
complete the proof of Theorem 11.23 (see Exercise 11.18 for the details). 

Q.E.D. 


D. Complex Multiplication and Ring Class Fields 


To prove Theorem 11.1, we will apply the modular equation to lattices with 
complex multiplication. The key point is that such lattices have some es- 
pecially interesting cyclic sublattices. To construct these sublattices, we will 
use the notion of a primitive ideal. Given an order O, we say that a proper 
O-ideal is primitive if it is not of the form da where d > 1 is an integer 
and a is a proper O-ideal. Then primitive ideals give us cyclic sublattices as 
follows: 


Lemma 11.26. Let O be an order in an imaginary quadratic field, and let 
b be a proper fractional O-ideal. Then, given a proper O-ideal a, ab is a 
sublattice of b of index N(a), and ab is a cyclic sublattice if and only if a is 
a primitive ideal. 


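Proof. Replacing $\mathfrak{b}$ by a multiple, we can assume that $\mathfrak{b} \subset \mathcal{O}$. Then the exact sequence

$$0 \longrightarrow \mathfrak{b}/\mathfrak{a}\mathfrak{b} \longrightarrow \mathcal{O}/\mathfrak{a}\mathfrak{b} \longrightarrow \mathcal{O}/\mathfrak{b} \longrightarrow 0$$

implies that $[\mathfrak{b} : \mathfrak{a}\mathfrak{b}]\,N(\mathfrak{b}) = N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b})$, and $[\mathfrak{b} : \mathfrak{a}\mathfrak{b}] = N(\mathfrak{a})$ follows.

Now assume that $\mathfrak{b}/\mathfrak{a}\mathfrak{b}$ is not cyclic. By part (a) of Exercise 11.16, it follows that $\mathfrak{b}/\mathfrak{a}\mathfrak{b}$ contains a subgroup isomorphic to $(\mathbb{Z}/d\mathbb{Z})^2$ for some $d > 1$, so that there is a sublattice $\mathfrak{a}\mathfrak{b} \subset \mathfrak{b}' \subset \mathfrak{b}$ such that $\mathfrak{b}'/\mathfrak{a}\mathfrak{b} \simeq (\mathbb{Z}/d\mathbb{Z})^2$. Since $\mathfrak{b}'$ is rank 2, this implies that $\mathfrak{a}\mathfrak{b} = d\mathfrak{b}'$, and then $\mathfrak{a} = d\mathfrak{b}'\mathfrak{b}^{-1}$. But $\mathfrak{b}'\mathfrak{b}^{-1} \subset \mathcal{O}$ since $\mathfrak{b}' \subset \mathfrak{b}$, which shows that $\mathfrak{a}$ is not primitive.

The converse, that $\mathfrak{a}$ not primitive implies $\mathfrak{b}/\mathfrak{a}\mathfrak{b}$ not cyclic, is even easier to prove, and is left to the reader (see Exercise 11.19). This completes the proof of the lemma. Q.E.D.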


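When we apply this lemma, $\mathfrak{a}$ will often be a principal ideal $\mathfrak{a} = \alpha\mathcal{O}$, $\alpha \in \mathcal{O}$. In this case, $\alpha\mathcal{O}$ is primitive as an ideal if and only if $\alpha$ is primitive as an element of $\mathcal{O}$ (which means that $\alpha$ is not of the form $d\beta$ where $d > 1$ is an integer and $\beta \in \mathcal{O}$). Since $N(\alpha) = N(\alpha\mathcal{O})$ by Lemma 7.14, we get the following corollary of Lemma 11.26:


Corollary 11.27. Let $\mathcal{O}$ and $\mathfrak{b}$ be as above. Then, given $\alpha \in \mathcal{O}$, $\alpha\mathfrak{b}$ is a sublattice of $\mathfrak{b}$ of index $N(\alpha)$, and $\alpha\mathfrak{b}$ is a cyclic sublattice if and only if $\alpha$ is primitive. Q.E.D.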
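To see the corollary at work in the simplest case, take $\mathcal{O} = \mathfrak{b} = \mathbb{Z}[i]$ and $\alpha = x + yi$. The sketch below is only an illustration under that assumption; the function names are ours. In the basis $1, i$ the sublattice $\alpha\mathbb{Z}[i]$ is spanned by $(x, y)$ and $(-y, x)$, so by the elementary divisor argument behind (11.25) the quotient is $\mathbb{Z}/d_1 \times \mathbb{Z}/d_2$ with $d_1 = \gcd(x,y)$ and $d_1 d_2 = x^2 + y^2$.

```python
from math import gcd


def quotient_type(x, y):
    """Invariant factors (d1, d2) of Z[i] / alpha Z[i] for alpha = x + y*i != 0.

    The sublattice has basis alpha = (x, y) and alpha*i = (-y, x), so
    d1 is the gcd of the matrix entries and d1 * d2 = N(alpha) = x^2 + y^2.
    """
    d1 = gcd(x, y)
    return d1, (x * x + y * y) // d1


def is_cyclic(x, y):
    """The quotient is cyclic exactly when the first invariant factor is 1."""
    return quotient_type(x, y)[0] == 1


if __name__ == "__main__":
    print(quotient_type(1, 1))  # alpha = 1 + i is primitive of norm 2
    print(quotient_type(2, 0))  # alpha = 2 is not primitive
```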


We are now ready to prove Theorem 11.1, the “First Main Theorem” of 
complex multiplication. 


Proof of Theorem 11.1. Let $\mathfrak{a}$ be a proper fractional $\mathcal{O}$-ideal, where $\mathcal{O}$ is an order in an imaginary quadratic field $K$. We must prove that $j(\mathfrak{a})$ is an algebraic integer and that $K(j(\mathfrak{a}))$ is the ring class field of $\mathcal{O}$. We will follow the proof given by Deuring in [24, §10].

Let's first use the modular equation to prove that $j(\mathfrak{a})$ is an algebraic integer. The basic idea is quite simple: let $\alpha \in \mathcal{O}$ be primitive so that by the above corollary, $\alpha\mathfrak{a}$ is a cyclic sublattice of $\mathfrak{a}$ of index $m = N(\alpha)$. Then, by Theorem 11.23, we know that

$$0 = \Phi_m(j(\alpha\mathfrak{a}), j(\mathfrak{a})) = \Phi_m(j(\mathfrak{a}), j(\mathfrak{a}))$$

since $j(\alpha\mathfrak{a}) = j(\mathfrak{a})$. Thus $j(\mathfrak{a})$ is a root of the polynomial $\Phi_m(X,X)$. Since $\Phi_m(X,Y)$ has integer coefficients (part (i) of Theorem 11.18), this shows that $j(\mathfrak{a})$ is an algebraic number. Furthermore, if we can pick $\alpha$ so that $m = N(\alpha)$ is not a perfect square, then the leading coefficient of $\Phi_m(X,X)$ is $\pm 1$ (part (iv) of Theorem 11.18), and thus $j(\mathfrak{a})$ will be an algebraic integer.

So can we find a primitive $\alpha \in \mathcal{O}$ such that $N(\alpha)$ is not a perfect square? We will see below in (11.28) that $\mathcal{O}$ has lots of $\alpha$'s such that $N(\alpha)$ is prime. Such an $\alpha$ is certainly primitive of nonsquare norm. For a more elementary proof, let $f$ be the conductor of $\mathcal{O}$. By Lemma 7.2, $\mathcal{O} = [1, f w_K]$, $w_K = (d_K + \sqrt{d_K})/2$. Then $\alpha = f w_K$ is primitive in $\mathcal{O}$, and one easily sees that its norm $N(\alpha)$ is not a perfect square (see Exercise 11.20).
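The claim about $N(f w_K)$ is easy to test numerically. The sketch below assumes only the formula $N(f w_K) = f^2 (d_K^2 - d_K)/4$, which follows from $w_K = (d_K + \sqrt{d_K})/2$; the list of discriminants is a sample chosen for illustration. Since $d_K^2 - d_K = |d_K|(|d_K| + 1)$ is a product of two consecutive positive integers, it is never a square, which is the point of Exercise 11.20.

```python
from math import isqrt


def norm_f_wK(dK, f):
    """N(f * w_K) for w_K = (d_K + sqrt(d_K)) / 2, with d_K = 0 or 1 mod 4."""
    return f * f * (dK * dK - dK) // 4


def is_square(n):
    return n >= 0 and isqrt(n) ** 2 == n


if __name__ == "__main__":
    sample = [-3, -4, -7, -8, -11, -15, -19, -20, -23, -24]
    for dK in sample:
        norms = [norm_f_wK(dK, f) for f in range(1, 6)]
        print(dK, norms, any(is_square(n) for n in norms))
```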

Let $L$ denote the ring class field of $\mathcal{O}$. In order to prove $L = K(j(\mathfrak{a}))$, we will study how integer primes decompose in $L$ and $K(j(\mathfrak{a}))$. We will make extensive use of the results of §8, especially Proposition 8.20. As usual, $f$ and $D$ will denote the conductor and discriminant of $\mathcal{O}$.

Let's first study how integer primes behave in the ring class field $L$. Let $\mathcal{S}_{L/\mathbb{Q}}$ be the set of primes that split completely in $L$. We claim that

(11.28)  $\mathcal{S}_{L/\mathbb{Q}} \doteq \{p \text{ prime} : p = N(\alpha) \text{ for some } \alpha \in \mathcal{O}\}$.

(As noted above, this shows that there are $\alpha$'s in $\mathcal{O}$ with $N(\alpha)$ prime.) When $D \equiv 0 \bmod 4$, then $\mathcal{O} = \mathbb{Z}[\sqrt{-n}]$ for some positive integer $n$. Thus $N(\alpha) = N(x + y\sqrt{-n}) = x^2 + ny^2$, so that (11.28) says, with finitely many exceptions, that the primes splitting completely in $L$ are those represented by $x^2 + ny^2$. This was proved in Theorem 9.4. The case when $D \equiv 1 \bmod 4$ is similar and was covered in Exercise 9.3. This proves (11.28).
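For small $n$ the right-hand side of (11.28) can be listed directly. The following sketch (function names ours) enumerates the primes $p = x^2 + ny^2$ up to a bound by brute force; it says nothing about splitting in $L$, which is the content of Theorem 9.4, and only makes the set of norms concrete. For $n = 1$ one recovers 2 and the primes congruent to 1 mod 4.

```python
from math import isqrt


def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def primes_of_form(n, bound):
    """Sorted primes p <= bound with p = x^2 + n*y^2 for integers x, y >= 0."""
    found = set()
    y = 0
    while n * y * y <= bound:
        for x in range(isqrt(bound - n * y * y) + 1):
            p = x * x + n * y * y
            if is_prime(p):
                found.add(p)
        y += 1
    return sorted(found)


if __name__ == "__main__":
    print(primes_of_form(1, 100))
    print(primes_of_form(2, 50))
```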

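Let $M = K(j(\mathfrak{a}))$. Since $L$ is Galois over $\mathbb{Q}$ by Lemma 9.3, part (i) of Proposition 8.20 shows that $M \subset L$ is equivalent to

(11.29)  $\mathcal{S}_{L/\mathbb{Q}} \mathrel{\dot\subset} \mathcal{S}_{M/\mathbb{Q}}$.

Take $p \in \mathcal{S}_{L/\mathbb{Q}}$, and assume that $p$ is unramified in $M$ (this excludes only finitely many $p$'s). By (11.28), $p = N(\alpha)$ for some $\alpha \in \mathcal{O}$. Then $\alpha\mathfrak{a} \subset \mathfrak{a}$ is a sublattice of index $N(\alpha) = p$, and is cyclic since $p$ is prime. Thus

$$0 = \Phi_p(j(\alpha\mathfrak{a}), j(\mathfrak{a})) = \Phi_p(j(\mathfrak{a}), j(\mathfrak{a})).$$

Using Kronecker's congruence from part (v) of Theorem 11.18, this implies that

$$0 = \Phi_p(j(\mathfrak{a}), j(\mathfrak{a})) = -(j(\mathfrak{a})^p - j(\mathfrak{a}))^2 + p\beta$$

for some $\beta \in \mathcal{O}_M$. Now let $\mathfrak{P}$ be any prime of $M$ containing $p$. The above equation then implies that

(11.30)  $j(\mathfrak{a})^p \equiv j(\mathfrak{a}) \bmod \mathfrak{P}$.

We claim the following:

(i) $\mathcal{O}_K[j(\mathfrak{a})] \subset \mathcal{O}_M$ has finite index.

(ii) If $p \nmid [\mathcal{O}_M : \mathcal{O}_K[j(\mathfrak{a})]]$, then (11.30) implies that $\alpha^p \equiv \alpha \bmod \mathfrak{P}$ for all $\alpha \in \mathcal{O}_M$.

The proof of (i) is a direct consequence of $M = K(j(\mathfrak{a}))$ and is left to the reader (see Exercise 11.21). As for (ii), note that $p$ splits completely in $L$, so that it splits completely in $K$, and hence $p \in \mathfrak{p} \subset \mathfrak{P}$ for some ideal $\mathfrak{p}$ of norm $p$. This implies that $\alpha^p \equiv \alpha \bmod \mathfrak{P}$ holds for all $\alpha \in \mathcal{O}_K$, and consequently the congruence holds for all $\alpha \in \mathcal{O}_K[j(\mathfrak{a})]$ by (11.30). Then (ii) follows easily (see Exercise 11.21).

From (ii) it follows that $f_{\mathfrak{P}|p} = 1$, and since this holds for any $\mathfrak{P}$ containing $p$, we see that $p$ splits completely in $M$. This proves (11.29), and $M \subset L$ follows.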
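Kronecker's congruence, which drives this step, can be checked by hand for $p = 2$. The sketch below assumes the classical coefficients of $\Phi_2(X,Y)$ (they are not derived in this section) and stores polynomials as dictionaries from exponent pairs to coefficients; all names are ours. It verifies $\Phi_2(X,Y) \equiv (X^2 - Y)(X - Y^2) \bmod 2$, and also that $\Phi_2(X,X)$ has leading coefficient $-1$ with roots $1728$, $8000$ and $-3375$, the standard $j$-invariants of the three orders containing a primitive element of norm 2.

```python
# Phi_2(X, Y) as {(i, j): coefficient of X^i Y^j}; classical values, assumed.
PHI2 = {
    (3, 0): 1, (0, 3): 1, (2, 2): -1,
    (2, 1): 1488, (1, 2): 1488,
    (2, 0): -162000, (0, 2): -162000,
    (1, 1): 40773375,
    (1, 0): 8748000000, (0, 1): 8748000000,
    (0, 0): -157464000000000,
}

# (X^2 - Y)(X - Y^2) = X^3 - X^2 Y^2 - X Y + Y^3
KRONECKER_2 = {(3, 0): 1, (2, 2): -1, (1, 1): -1, (0, 3): 1}


def congruent_mod(p, f, g):
    """True if every coefficient of f - g is divisible by p."""
    keys = set(f) | set(g)
    return all((f.get(k, 0) - g.get(k, 0)) % p == 0 for k in keys)


def evaluate(poly, x, y):
    return sum(c * x**i * y**j for (i, j), c in poly.items())


if __name__ == "__main__":
    print(congruent_mod(2, PHI2, KRONECKER_2))
    print(evaluate(PHI2, 8000, 8000), evaluate(PHI2, -3375, -3375))
```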
The inclusion $M = K(j(\mathfrak{a})) \subset L$ shows that the ring class field $L$ contains the $j$-invariants of all proper fractional $\mathcal{O}$-ideals. Let $h = h(\mathcal{O})$, and let $\mathfrak{a}_i$, $i = 1,\dots,h$ be class representatives for $C(\mathcal{O})$. It follows that any $j(\mathfrak{a})$ equals one of $j(\mathfrak{a}_1),\dots,j(\mathfrak{a}_h)$, and furthermore $j(\mathfrak{a}_1),\dots,j(\mathfrak{a}_h)$ are distinct. Thus

(11.31)  $\Delta = \prod_{i<j} \bigl(j(\mathfrak{a}_i) - j(\mathfrak{a}_j)\bigr)$

is a nonzero element of $\mathcal{O}_L$.

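To prove the opposite inclusion $L \subset M$, we will use the criterion $\tilde{\mathcal{S}}_{M/\mathbb{Q}} \mathrel{\dot\subset} \mathcal{S}_{L/\mathbb{Q}}$ from part (ii) of Proposition 8.20. So let $p \in \tilde{\mathcal{S}}_{M/\mathbb{Q}}$, which means that $p$ is unramified in $M$ and $f_{\mathfrak{P}|p} = 1$ for some prime $\mathfrak{P}$ of $M$ containing $p$. In particular, this implies that $p$ splits completely in $K$, and thus $p = N(\mathfrak{p})$ for some prime ideal $\mathfrak{p}$ of $\mathcal{O}_K$. Then Proposition 7.20 tells us that $p = N(\mathfrak{p} \cap \mathcal{O})$ (we can assume that $p$ does not divide $f$, which excludes only finitely many primes). If we can show that $\mathfrak{p} \cap \mathcal{O}$ is a principal ideal $\alpha\mathcal{O}$, then $p = N(\alpha)$ implies that $p \in \mathcal{S}_{L/\mathbb{Q}}$ by (11.28). We may assume that $p$ is relatively prime to the element $\Delta$ of (11.31).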

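Let $\mathfrak{a}' = (\mathfrak{p} \cap \mathcal{O})\mathfrak{a}$. Since $\mathfrak{p} \cap \mathcal{O}$ has norm $p$, $\mathfrak{a}' \subset \mathfrak{a}$ is a sublattice of index $p$ by Lemma 11.26, and it is cyclic since $p$ is prime. Thus $\Phi_p(j(\mathfrak{a}'), j(\mathfrak{a})) = 0$. Using Kronecker's congruence again, we can write this as

$$0 = \Phi_p(j(\mathfrak{a}'), j(\mathfrak{a})) = \bigl(j(\mathfrak{a}')^p - j(\mathfrak{a})\bigr)\bigl(j(\mathfrak{a}') - j(\mathfrak{a})^p\bigr) + pQ(j(\mathfrak{a}'), j(\mathfrak{a}))$$

for some polynomial $Q(X,Y) \in \mathbb{Z}[X,Y]$. Let $\tilde{\mathfrak{P}}$ be a prime of $L$ containing $\mathfrak{P}$. Since $j(\mathfrak{a}')$ and $j(\mathfrak{a})$ are algebraic integers lying in $L$, the above equation implies that $pQ(j(\mathfrak{a}'), j(\mathfrak{a})) \in \tilde{\mathfrak{P}}$. Thus

(11.32)  $j(\mathfrak{a}')^p \equiv j(\mathfrak{a}) \bmod \tilde{\mathfrak{P}}$  or  $j(\mathfrak{a}') \equiv j(\mathfrak{a})^p \bmod \tilde{\mathfrak{P}}$.

However, we also know $f_{\mathfrak{P}|p} = 1$, which tells us that $j(\mathfrak{a})^p \equiv j(\mathfrak{a}) \bmod \mathfrak{P}$, and since $\mathfrak{P} \subset \tilde{\mathfrak{P}}$, we obtain

(11.33)  $j(\mathfrak{a})^p \equiv j(\mathfrak{a}) \bmod \tilde{\mathfrak{P}}$.

It is straightforward to show that (11.32) and (11.33) imply

$$j(\mathfrak{a}) \equiv j(\mathfrak{a}') \bmod \tilde{\mathfrak{P}}.$$

If $\mathfrak{a}$ and $\mathfrak{a}'$ lay in distinct ideal classes in $C(\mathcal{O})$, then $j(\mathfrak{a}) - j(\mathfrak{a}')$ would be one of the factors of $\Delta$ from (11.31), and $p$ and $\Delta$ would not be relatively prime. This contradicts our choice of $p$, so that $\mathfrak{a}$ and $\mathfrak{a}' = (\mathfrak{p} \cap \mathcal{O})\mathfrak{a}$ must lie in the same ideal class in $C(\mathcal{O})$. This forces $\mathfrak{p} \cap \mathcal{O}$ to be a principal ideal, which as we showed above, implies that $p \in \mathcal{S}_{L/\mathbb{Q}}$. Thus $\tilde{\mathcal{S}}_{M/\mathbb{Q}} \mathrel{\dot\subset} \mathcal{S}_{L/\mathbb{Q}}$, which completes the proof that $L = M$. Theorem 11.1 is proved. Q.E.D.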
As an application of Theorem 11.1, let's see what it tells us about the Abelian extensions of an imaginary quadratic field $K$. First, we know that the Hilbert class field of $K$ is the ring class field of the maximal order $\mathcal{O}_K$. Thus we get the following corollary of Theorem 11.1:


Corollary 11.34. If $K$ is an imaginary quadratic field, then $K(j(\mathcal{O}_K))$ is the Hilbert class field of $K$. Q.E.D.


Besides the Hilbert class field, Theorem 11.1 also allows us to describe other Abelian extensions of $K$. Recall that in Theorem 9.18 we proved that an Abelian extension of $K$ is generalized dihedral over $\mathbb{Q}$ if and only if it lies in some ring class field of $K$. Combining this with Theorem 11.1, we get the following corollary:


Corollary 11.35. Let $K$ be an imaginary quadratic field, and let $K \subset L$ be a finite extension. Then $L$ is an Abelian extension of $K$ which is generalized dihedral over $\mathbb{Q}$ if and only if there is an order $\mathcal{O}$ in $K$ such that $L \subset K(j(\mathcal{O}))$. Q.E.D.


To complete our discussion of ring class fields and complex multipli- 
cation, we need to compute the Artin map of a ring class field using j- 
invariants. The answer is given by the following theorem: 


Theorem 11.36. Let $\mathcal{O}$ be an order in an imaginary quadratic field $K$, and let $L$ be the ring class field of $\mathcal{O}$. If $\mathfrak{a}$ is a proper fractional $\mathcal{O}$-ideal and $\mathfrak{p}$ is a prime ideal of $\mathcal{O}_K$, then

$$\left(\frac{L/K}{\mathfrak{p}}\right)(j(\mathfrak{a})) = j(\overline{\mathfrak{p} \cap \mathcal{O}}\,\mathfrak{a}).$$


Proof. For analytic proofs, see Deuring [24, §15], Lang [73, Chapter 12, §3] or Cohn [21, §11.2], while algebraic proofs (which use the reduction theory of elliptic curves) may be found in Lang [73, Chapter 10, §3] or Shimura [90, §5.4]. We will use this theorem (in the guise of Corollary 11.37 below) in §12 when we compute some $j$-invariants, though our discussion of the class equation in §13 will use only Theorem 11.1. Q.E.D.


In terms of the ideal class group, Theorem 11.36 can be stated as follows:


Corollary 11.37. Let $\mathcal{O}$ be an order in an imaginary quadratic field $K$, and let $L$ be the ring class field of $\mathcal{O}$. Given proper fractional $\mathcal{O}$-ideals $\mathfrak{a}$ and $\mathfrak{b}$, define $\sigma_{\mathfrak{a}}(j(\mathfrak{b}))$ by

$$\sigma_{\mathfrak{a}}(j(\mathfrak{b})) = j(\overline{\mathfrak{a}}\,\mathfrak{b}).$$

Then $\sigma_{\mathfrak{a}}$ is a well-defined element of $\mathrm{Gal}(L/K)$, and $\mathfrak{a} \mapsto \sigma_{\mathfrak{a}}$ induces an isomorphism

$$C(\mathcal{O}) \xrightarrow{\ \sim\ } \mathrm{Gal}(L/K).$$


Proof. This is a straightforward consequence of Theorem 11.36 and the isomorphisms

$$C(\mathcal{O}) \simeq I(\mathcal{O},f)/P(\mathcal{O},f) \simeq I_K(f)/P_{K,\mathbb{Z}}(f),$$

where $f$ is the conductor of $\mathcal{O}$. See Exercise 11.22 for the details. Q.E.D.


The “First Main Theorem” of complex multiplication allowed us to de- 
scribe some of the Abelian extensions of K, namely those which are gen- 
eralized dihedral over Q. The “Second Main Theorem” of complex multi- 
plication answers the question of how to describe all Abelian extensions of 
K. By class field theory, every Abelian extension lies in a ray class field 
for some modulus m of K, so that we need only find generators for the ray 
class fields of K. Rather than work with an arbitrary modulus m, we will 
describe the ray class fields only for moduli of the form $N\mathcal{O}_K$, where $N$ is
a positive integer. It is easy to see that any Abelian extension of K lies in
such a ray class field (see Exercise 11.23). 

The basic idea is that the ray class field of $N\mathcal{O}_K$ is obtained by adjoining, first, the $j$-invariant $j(L)$ of some lattice $L$, and second, some values of the Weierstrass $\wp$-function evaluated at $N$-division points of the lattice $L$, i.e., if $L = [\alpha,\beta]$, then we use

(11.38)  $\wp\!\left(\dfrac{m\alpha + n\beta}{N}; L\right)$

for suitable $m$ and $n$. The observation that (11.38) generates Abelian extensions of $K$ goes back to Abel. The problem is that these values aren't invariant enough: if we multiply the lattice by a constant, the $j$-invariant remains the same, but the values (11.38) change. To remedy this problem, we introduce a variant of the Weierstrass $\wp$-function called the Weber function. Given the lattice $L$, the Weber function $\tau(z;L)$ is defined by

$$\tau(z;L) = \begin{cases} \dfrac{g_2(L)^2}{\Delta(L)}\,\wp(z;L)^2 & \text{if } g_3(L) = 0, \\[2ex] \dfrac{g_3(L)}{\Delta(L)}\,\wp(z;L)^3 & \text{if } g_2(L) = 0, \\[2ex] \dfrac{g_2(L)g_3(L)}{\Delta(L)}\,\wp(z;L) & \text{otherwise,} \end{cases}$$

where $\Delta(L) = g_2(L)^3 - 27g_3(L)^2$. It is easy to check that $\tau(\lambda z; \lambda L) = \tau(z;L)$ for all $\lambda \in \mathbb{C}^*$ (see Exercise 11.24).
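The invariance under $L \mapsto \lambda L$ comes down to bookkeeping of homogeneity degrees. The sketch below assumes the standard scaling laws $g_2(\lambda L) = \lambda^{-4}g_2(L)$, $g_3(\lambda L) = \lambda^{-6}g_3(L)$, $\Delta(\lambda L) = \lambda^{-12}\Delta(L)$ and $\wp(\lambda z;\lambda L) = \lambda^{-2}\wp(z;L)$, and simply adds up the exponents of $\lambda$ in each of the three cases of the Weber function. The dictionary keys are labels of our own choosing.

```python
# Exponent of lambda picked up by each quantity under (z, L) -> (lambda z, lambda L).
DEGREE = {"g2": -4, "g3": -6, "Delta": -12, "wp": -2}

# Each case of the Weber function as a list of (quantity, power) factors.
WEBER_CASES = {
    "g3 = 0": [("g2", 2), ("Delta", -1), ("wp", 2)],
    "g2 = 0": [("g3", 1), ("Delta", -1), ("wp", 3)],
    "otherwise": [("g2", 1), ("g3", 1), ("Delta", -1), ("wp", 1)],
}


def total_degree(factors):
    """Net exponent of lambda for a product of powers of the basic quantities."""
    return sum(DEGREE[name] * power for name, power in factors)


if __name__ == "__main__":
    for case, factors in WEBER_CASES.items():
        print(case, total_degree(factors))
```

Each case has net degree 0, which is why the powers of $\wp$ differ from case to case.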
We can now state the “Second Main Theorem” of complex multiplication, which uses singular $j$-invariants and the Weber function to generate ray class fields:


Theorem 11.39. Let $K$ be an imaginary quadratic field of discriminant $d_K$, and let $N$ be a positive integer.

(i) $K(j(\mathcal{O}_K), \tau(1/N; \mathcal{O}_K))$ is the ray class field for the modulus $N\mathcal{O}_K$.

(ii) Let $\mathcal{O}$ be the order of conductor $N$ in $K$. Then $K(j(\mathcal{O}), \tau(w_K; \mathcal{O}))$, where $w_K = (d_K + \sqrt{d_K})/2$, is the ray class field for the modulus $N\mathcal{O}_K$.


Proof. Notice that in each case we obtain the ray class field by adjoining the $j$-invariant of a lattice and the Weber function of one $N$-division point. The proof of (i) may be found in Deuring [24, §26] or Lang [73, §10.3, Corollary to Theorem 7], and the proof of (ii) follows from Satz 1 of Franz [37]. These references also explain how to generate the ray class field of an arbitrary modulus $\mathfrak{m}$ of $K$. Q.E.D.


The theory of complex multiplication, even in the one variable case de- 
scribed here, is still an active area of research. See, for example, the books 
Elliptic Functions and Rings of Integers [15] by Cassou-Nogués and Taylor 
and Arithmetic on Elliptic Curves with Complex Multiplication [45] by Gross. 


E. Exercises 


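11.1. This exercise will study $j$-invariants and complex conjugation.

(a) Let $L$ be a lattice, and let $\overline{L}$ denote the lattice obtained by complex conjugation. Prove that $g_2(\overline{L}) = \overline{g_2(L)}$, $g_3(\overline{L}) = \overline{g_3(L)}$ and $j(\overline{L}) = \overline{j(L)}$.

(b) Let $\mathfrak{a}$ be a proper fractional $\mathcal{O}$-ideal, where $\mathcal{O}$ is an order in an imaginary quadratic field. Show that $j(\mathfrak{a})$ is a real number if and only if the class of $\mathfrak{a}$ has order $\le 2$ in the ideal class group $C(\mathcal{O})$. Hint: use (a) and Theorem 10.9.

One consequence of (b) is that $j(\mathcal{O})$ is real for any order $\mathcal{O}$.

11.2. If $\tau \in \mathfrak{h}$ and $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z})$, then show that

$$\gamma\tau = \frac{a\tau + b}{c\tau + d}$$

also lies in $\mathfrak{h}$. This shows that $\mathrm{SL}(2,\mathbb{Z})$ acts on $\mathfrak{h}$. Hint: use (7.9).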


11.3. Let $\tau$ satisfy $|\mathrm{Re}(\tau)| \le 1/2$ and $|\mathrm{Im}(\tau)| \ge \epsilon$, where $\epsilon < 1$ is fixed. Our goal is to show that for $x,y \in \mathbb{R}$,

$$|x + y\tau| \ge \frac{\epsilon}{2}\sqrt{x^2 + y^2}.$$

If we let $\tau = a + bi$, then the above is equivalent to

$$(x + ay)^2 + b^2y^2 \ge \frac{\epsilon^2}{4}(x^2 + y^2).$$

(a) Show that the inequality is true when $|x + ay| \ge (\epsilon/2)|x|$.

(b) When $|x + ay| < (\epsilon/2)|x|$, use $|a| \le 1/2$ and $\epsilon < 1$ to show that $|x| < |y|$.

(c) Using (b), show that the inequality is true when $|x + ay| < (\epsilon/2)|x|$.


11.4. In Lemma 11.4 we showed that every point of $\mathfrak{h}$ is $\mathrm{SL}(2,\mathbb{Z})$-equivalent to a point in the region $\{\tau \in \mathfrak{h} : |\mathrm{Re}(\tau)| \le 1/2,\ |\mathrm{Im}(\tau)| \ge 1/2\}$. In this exercise we will study the smaller region

$$F = \{\tau \in \mathfrak{h} : |\mathrm{Re}(\tau)| \le 1/2,\ |\tau| \ge 1,\ \text{and } \mathrm{Re}(\tau) \le 0 \text{ if } |\mathrm{Re}(\tau)| = 1/2 \text{ or } |\tau| = 1\},$$

and we will show that every point of $\mathfrak{h}$ is $\mathrm{SL}(2,\mathbb{Z})$-equivalent to a unique point of $F$. This is usually expressed by saying that $F$ is a fundamental domain for the action of $\mathrm{SL}(2,\mathbb{Z})$ on $\mathfrak{h}$. Our basic tool will be positive definite quadratic forms $f(x,y) = ax^2 + bxy + cy^2$, where we allow $a$, $b$ and $c$ to be real numbers. We say that two such forms $f(x,y)$ and $g(x,y)$ are $\mathbb{R}^+$-equivalent if there is $\begin{pmatrix} p & q \\ r & s \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z})$ such that

$$f(x,y) = \lambda\, g(px + qy, rx + sy)$$

for some $\lambda > 0$ in $\mathbb{R}$. Finally, we say that $f(x,y) = ax^2 + bxy + cy^2$ is reduced if

$$|b| \le a \le c, \quad\text{and } b \ge 0 \text{ if } |b| = a \text{ or } a = c.$$

This is consistent with the definition given in §2.


(a) Show that $\mathbb{R}^+$-equivalence of positive definite forms is an equivalence relation.

(b) Show that every positive definite form is $\mathbb{R}^+$-equivalent to a reduced form, and that two reduced forms are $\mathbb{R}^+$-equivalent if and only if one is a constant multiple of the other. Hint: see the proof of Theorem 2.8.

(c) Show that every positive definite form $f(x,y) = ax^2 + bxy + cy^2$ can be written uniquely as $f(x,y) = a|x - \tau y|^2$, where $\tau \in \mathfrak{h}$. In this case we say that $\tau$ is the root of $f(x,y)$ (this is consistent with the terminology used in §7). Furthermore, show that $b = -2a\,\mathrm{Re}(\tau)$ and $c = a|\tau|^2$.

(d) Show that two positive definite forms are $\mathbb{R}^+$-equivalent if and only if their roots are $\mathrm{SL}(2,\mathbb{Z})$-equivalent. Hint: see the proof of (7.8).

(e) Show that a positive definite form is reduced if and only if its root lies in the fundamental domain $F$.

(f) Conclude that every $\tau \in \mathfrak{h}$ is $\mathrm{SL}(2,\mathbb{Z})$-equivalent to a unique point of $F$.

This exercise shows that there is a remarkable relation between reduced forms and fundamental domains. Similar considerations led Gauss (unpublished, of course) to discover the idea of a fundamental domain in the early 1800s. See Cox [23] for more details.


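11.5. In this exercise we will prove Lemma 11.5 and Corollary 11.6.

(a) Let $M$ and $\epsilon$ be positive constants, and define $K \subset \mathfrak{h}$ by

$$K = \{\tau \in \mathfrak{h} : |\mathrm{Re}(\tau)| \le M,\ \epsilon \le |\mathrm{Im}(\tau)| \le 1/\epsilon\}.$$

We want to show that the set $\Delta(K) = \{\gamma \in \mathrm{SL}(2,\mathbb{Z}) : \gamma(K) \cap K \ne \emptyset\}$ is finite. So take $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Delta(K)$, which means that there is $\tau \in K$ such that $\gamma\tau \in K$. If we can bound $|a|$, $|b|$, $|c|$ and $|d|$ in terms of $M$ and $\epsilon$, then finiteness will follow.

(i) Use (7.9) to show that $|c\tau + d| \le 1/\epsilon$.

(ii) Since $|c\tau + d|^2 = (c\,\mathrm{Re}(\tau) + d)^2 + c^2\,\mathrm{Im}(\tau)^2$, conclude that $|c| \le 1/\epsilon^2$ and $|d| \le (\epsilon + M)/\epsilon^2$.

(iii) Show that $\gamma^{-1} \in \Delta(K)$. By (ii), this implies that $|a| \le (\epsilon + M)/\epsilon^2$.

(iv) Show that $|b| \le |c\tau + d||\gamma\tau| + |a||\tau|$, and conclude that $|b|$ is bounded in terms of $M$ and $\epsilon$.

(b) Use (a) to show that if $U$ is a neighborhood of $\tau \in \mathfrak{h}$ such that $\overline{U} \subset \mathfrak{h}$ is compact, then $\{\gamma \in \mathrm{SL}(2,\mathbb{Z}) : \gamma(U) \cap U \ne \emptyset\}$ is finite. This will prove Lemma 11.5.

(c) Prove Corollary 11.6.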

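11.6. This exercise is concerned with the proof of part (iv) of Theorem 11.2.

(a) Suppose that $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z})$ and that $\gamma\tau = \tau$ for some $\tau \in \mathfrak{h}$. We saw in the text that this implies $[1,\tau] = (c\tau + d)[1,\tau]$. Prove that $c \ne 0$. Hint: show that $c = 0$ implies $\gamma = \pm\begin{pmatrix} 1 & m \\ 0 & 1 \end{pmatrix}$. But such a $\gamma$ with $m \ne 0$ has no fixed points on $\mathfrak{h}$.

(b) Let $\mathcal{O}$ be an order in an imaginary quadratic field such that $\mathcal{O}^* \ne \{\pm 1\}$. Prove that $\mathcal{O} = \mathcal{O}_K$ for $K = \mathbb{Q}(i)$ or $\mathbb{Q}(\omega)$, $\omega = e^{2\pi i/3}$. Hint: when $\mathcal{O} = \mathcal{O}_K$, see Exercise 5.9. See also Lemma 12.

(c) Show that the only elements of $\mathrm{SL}(2,\mathbb{Z})$ fixing $i$ are $\pm\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\pm\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. Hint: use (a).

(d) If $\omega = e^{2\pi i/3}$, show that $j'(\omega) = j''(\omega) = 0$ but $j'''(\omega) \ne 0$.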

11.7. Let $f(\tau)$ be a modular function for $\mathrm{SL}(2,\mathbb{Z})$, and assume that $f(\tau)$ has a pole of order $m$ at $\tau = \omega$, $\omega = e^{2\pi i/3}$.

(a) Prove that $m$ is divisible by 3. Hint: argue as in the case when $f(\tau)$ has a pole at $\tau = i$. Note that $\omega$ is fixed by $\begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}$.

(b) Prove that $j(\tau)^{m/3}f(\tau)$ is holomorphic at $\omega$. Hint: use part (iv) of Theorem 11.2.


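11.8. As in the proof of Theorem 11.9, let

$$\Gamma_0(m) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z}) : c \equiv 0 \bmod m \right\}$$

$$C(m) = \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} : ad = m,\ a > 0,\ 0 \le b < d,\ \gcd(a,b,d) = 1 \right\},$$

and let $\sigma_0 = \begin{pmatrix} m & 0 \\ 0 & 1 \end{pmatrix} \in C(m)$.

(a) Show that $\Gamma_0(m) = (\sigma_0^{-1}\,\mathrm{SL}(2,\mathbb{Z})\,\sigma_0) \cap \mathrm{SL}(2,\mathbb{Z})$.

(b) If $\sigma \in C(m)$, then show that $(\sigma_0^{-1}\,\mathrm{SL}(2,\mathbb{Z})\,\sigma) \cap \mathrm{SL}(2,\mathbb{Z})$ is a coset of $\Gamma_0(m)$ in $\mathrm{SL}(2,\mathbb{Z})$.

(c) In the construction of part (b), show that different $\sigma$'s give different cosets, and that all cosets of $\Gamma_0(m)$ in $\mathrm{SL}(2,\mathbb{Z})$ arise in this way.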


11.9. Let $m$ be a positive integer, and let $\psi(m)$ denote the number of triples $(a,b,d)$ of integers which satisfy $ad = m$, $a > 0$, $0 \le b < d$ and $\gcd(a,b,d) = 1$. Thus $\psi(m) = |C(m)|$, where $C(m)$ is the set of matrices defined in the previous exercise. The goal of this exercise is to prove that

$$\psi(m) = m \prod_{p \mid m} \left(1 + \frac{1}{p}\right).$$

(a) If we fix a positive divisor $d$ of $m$, then $a = m/d$ is determined. Show that the number of possible $b$'s for this $d$ is given by

$$\frac{d}{\gcd(d, m/d)}\,\phi(\gcd(d, m/d)),$$

where $\phi$ denotes the Euler $\phi$-function.

(b) Use the formula of (a) to prove that $\psi(m)$ is multiplicative, i.e., that if $m_1$ and $m_2$ are relatively prime, then $\psi(m_1m_2) = \psi(m_1)\psi(m_2)$.

(c) Use the formula of (a) to prove that if $p$ is a prime, then $\psi(p^r) = p^r + p^{r-1}$.

(d) Use (b) and (c) to prove the desired formula for $\psi(m)$.

11.10. In this exercise we will show that $\Phi_m(X,Y)$ is irreducible as a polynomial in $X$ (which will prove part (ii) of Theorem 11.18). Let $\gamma_i$ be coset representatives for $\Gamma_0(m)$ in $\mathrm{SL}(2,\mathbb{Z})$. As we saw in (11.14), we can write

$$\Phi_m(X, j(\tau)) = \prod_{i=1}^{|C(m)|} \bigl(X - j(m\gamma_i\tau)\bigr).$$

Let $F_m$ be the field $\mathbb{C}(j(\tau), j(m\tau))$. Since $\Phi_m(X, j(\tau))$ has coefficients in $\mathbb{C}(j(\tau))$ and $j(m\tau)$ is a root, it follows that $[F_m : \mathbb{C}(j(\tau))] \le \psi(m)$. If we can prove equality, then $\Phi_m(X, j(\tau))$ will be the minimal polynomial of $j(m\tau)$ over $\mathbb{C}(j(\tau))$, and irreducibility will follow.

(a) Let $\mathcal{F}$ be the field of all meromorphic functions on $\mathfrak{h}$, which contains $F_m$ as a subfield. For $\gamma \in \mathrm{SL}(2,\mathbb{Z})$, show that $f(\tau) \mapsto f(\gamma\tau)$ is an embedding of $F_m$ into $\mathcal{F}$ which is the identity on $\mathbb{C}(j(\tau))$.

(b) Use (11.13) to show that $j(m\gamma_i\tau) \ne j(m\gamma_j\tau)$ for $i \ne j$. The embeddings constructed in (a) are thus distinct, which shows that $[F_m : \mathbb{C}(j(\tau))] \ge \psi(m)$. This proves the desired equality.


11.11. Show that the coefficients of $G(X,\tau)$ (as defined in (11.16)) are modular functions for $\mathrm{SL}(2,\mathbb{Z})$. Hint: argue as in the case of the modular function. You will use the fact that $f(\gamma_i\tau)$ has a meromorphic $q$-expansion.


11.12. Let $A \subset \mathbb{C}$ be an additive subgroup, and let $f(\tau)$ be a holomorphic modular function. Suppose that its $q$-expansion is

$$f(\tau) = \sum_{n=-M}^{\infty} a_n q^n,$$

and that $a_n \in A$ for all $n \le 0$. Then prove the Hasse $q$-expansion principle, which states that $f(\tau)$ is a polynomial in $j(\tau)$ with coefficients in $A$. Hint: since the $q$-expansion of $j(\tau)$ has integer coefficients and begins with $1/q$, the polynomial $A(x)$ used in part (ii) of Lemma 11.10 must have coefficients in $A$.
11.13. Let $p$ be a prime, and let $\zeta_p = e^{2\pi i/p}$.

(a) Prove that $p = (1 - \zeta_p)(1 - \zeta_p^2)\cdots(1 - \zeta_p^{p-1})$. Hint: factor $x^{p-1} + \cdots + x + 1$.

(b) Given $\alpha \in \mathbb{Z}[\zeta_p]$, define the norm $N_{\mathbb{Q}(\zeta_p)/\mathbb{Q}}(\alpha)$ to be the number

$$N_{\mathbb{Q}(\zeta_p)/\mathbb{Q}}(\alpha) = \prod_{\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})} \sigma(\alpha).$$

For simplicity, we will write $N(\alpha)$ instead of $N_{\mathbb{Q}(\zeta_p)/\mathbb{Q}}(\alpha)$. Then prove that $N(\alpha)$ is an integer, and show that $N(\alpha\beta) = N(\alpha)N(\beta)$ and $N(1 - \zeta_p) = p$.

(c) If an integer $a$ can be written $a = (1 - \zeta_p)\alpha$ where $\alpha \in \mathbb{Z}[\zeta_p]$, then use (b) to prove that $a$ is divisible by $p$.


11.14. Adapt the proof of (11.21) to show that $j(\tau) \equiv j(\sigma_0\tau)^p \bmod p$.


11.15. Let $f(X,Y) \in \mathbb{Z}[X,Y]$ be a polynomial such that $f(X, j(\tau)) \in p\mathbb{Z}((q))[X]$. Prove that $f(X,Y) \in p\mathbb{Z}[X,Y]$. Hint: apply the $q$-expansion principle (Exercise 11.12) to the coefficients of $X$.


11.16. Let $M = \mathbb{Z}^2$, and let $A$ be a $2 \times 2$ integer matrix with $\det(A) \ne 0$. We know by Exercise 7.15 that $M/AM$ is a finite group of order $|\det(A)|$. The object of this exercise is to prove that $M/AM$ is cyclic if and only if the entries of $A$ are relatively prime.

(a) Let $G$ be a finite Abelian group. Prove that $G$ is not cyclic if and only if $G$ contains a subgroup isomorphic to $(\mathbb{Z}/d\mathbb{Z})^2$ for some integer $d > 1$. Hint: use the structure theorem for finite Abelian groups.

(b) Assume that the entries of $A$ have a common divisor $d > 1$, and prove that $M/AM$ is not cyclic. Hint: write $A = dA'$, where $A'$ is an integer matrix, and note that $A'M/dA'M \subset M/AM$. Then use (a).

(c) Finally, assume that $M/AM$ is not cyclic, and prove that the entries of $A$ have a common divisor $d > 1$. Hint: by (a), there is $AM \subset M' \subset M$ such that $M'/AM \simeq (\mathbb{Z}/d\mathbb{Z})^2$ for some $d > 1$. Prove that $AM = dM'$, and conclude that $d$ divides the entries of $A$.


11.17. This exercise is concerned with the proof of Lemma 11.24.

(a) Let $L'$ be a sublattice of $[1,\tau]$ of finite index, and let $d$ be the smallest positive integer in $L'$. Then prove that $L' = [d, a\tau + b]$ for some integers $a$ and $b$.

(b) Let $\tau \in \mathfrak{h}$, and let $C(m)$ be the set of matrices defined in the text. If $\sigma,\sigma' \in C(m)$ and $d[1,\sigma\tau] = d'[1,\sigma'\tau]$, then prove that $\sigma = \sigma'$.
11.18. In the text, we proved that for $\tau \in \mathfrak{h}$, the roots of $\Phi_m(X, j(\tau)) = 0$ are the $j$-invariants of the cyclic sublattices of index $m$ of $[1,\tau]$. Use this fact and the surjectivity of the $j$-function to prove Theorem 11.23.


11.19. Let $\mathcal{O}$ be an order, and let $\mathfrak{b}$ be a proper fractional $\mathcal{O}$-ideal. If $\mathfrak{a}$ is a proper $\mathcal{O}$-ideal which is not primitive, then prove that $\mathfrak{b}/\mathfrak{a}\mathfrak{b}$ is not cyclic. Hint: use part (a) of Exercise 11.16.


11.20. Let $\mathcal{O}$ be an order in an imaginary quadratic field $K$ of conductor $f$. Letting $w_K = (d_K + \sqrt{d_K})/2$, we proved in Lemma 7.2 that $\mathcal{O} = [1, fw_K]$. Prove that $\alpha = fw_K$ is a primitive element of $\mathcal{O}$ whose norm is not a perfect square.


11.21. Let $K \subset L$ be an extension of number fields, and let $\alpha \in \mathcal{O}_L$ satisfy $L = K(\alpha)$.

(a) Prove that $\mathcal{O}_K[\alpha]$ has finite index in $\mathcal{O}_L$. Hint: By Theorem 5.3, we know that $\mathcal{O}_L$ is a free $\mathbb{Z}$-module of rank $[L : \mathbb{Q}]$. Then show that $\mathcal{O}_K[\alpha]$ has the same rank.

(b) Let $\mathfrak{P}$ be a prime ideal of $\mathcal{O}_L$, and suppose that $N(\mathfrak{P}) = p^f$, where $p$ is relatively prime to $[\mathcal{O}_L : \mathcal{O}_K[\alpha]]$. If $\beta^p \equiv \beta \bmod \mathfrak{P}$ holds for all $\beta \in \mathcal{O}_K[\alpha]$, then show that the same congruence holds for all $\beta \in \mathcal{O}_L$. Hint: if $N = [\mathcal{O}_L : \mathcal{O}_K[\alpha]]$, then multiplication by $N$ induces an isomorphism of $\mathcal{O}_L/\mathfrak{P}$.


11.22. Complete the proof of Corollary 11.37.


11.23. Let $K$ be an imaginary quadratic field, and let $L$ be an Abelian extension of $K$. Prove that there is a positive integer $N$ such that $L$ is contained in the ray class field for the modulus $N\mathcal{O}_K$.


11.24. If $L$ is a lattice and $\tau(z;L)$ is the Weber function defined in the text, then prove that $\tau(\lambda z; \lambda L) = \tau(z;L)$ for any $\lambda \in \mathbb{C}^*$.


§12. MODULAR FUNCTIONS AND SINGULAR j-INVARIANTS


The $j$-invariant $j(L)$ of a lattice with complex multiplication is often called a singular $j$-invariant or a singular modulus. In §11 we learned about the fields generated by singular moduli, and in this section we will compute some of these remarkable numbers. One of our main tools will be the function $\gamma_2(\tau)$, which is defined by

$$\gamma_2(\tau) = \sqrt[3]{j(\tau)}.$$

We will show that $\gamma_2(3\tau)$ is a modular function for $\Gamma_0(9)$, and we will use $\gamma_2(\tau)$ to generate ring class fields for orders of discriminant not divisible by 3. This will explain why the $j$-invariants computed in §10 were perfect cubes.

We will then give a modern treatment of some of the results contained in Volume III of Weber's monumental Lehrbuch der Algebra [102]. There is a wealth of material in this book, far more than we could ever cover here. We will concentrate on some applications of the Dedekind $\eta$-function $\eta(\tau)$ and the three Weber functions $\mathfrak{f}(\tau)$, $\mathfrak{f}_1(\tau)$ and $\mathfrak{f}_2(\tau)$. These functions are closely related to $\gamma_2(\tau)$ and $j(\tau)$ and make it easy to compute the $j$-invariants of most orders of class number 1. The Weber functions also give some interesting modular functions, which will enable us to compute that

(12.1)  $j(\sqrt{-14}) = 2^3\left(323 + 228\sqrt{2} + (231 + 161\sqrt{2})\sqrt{2\sqrt{2} - 1}\right)^3.$
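A quick floating-point sanity check of (12.1) is possible because $q = e^{2\pi i\tau}$ is tiny at $\tau = \sqrt{-14}$. The sketch below assumes the standard leading terms $j(\tau) = q^{-1} + 744 + 196884q + \cdots$ of the $q$-expansion and compares them with the closed form; the function names are ours, and the agreement is only up to rounding error, so it proves nothing.

```python
import math


def j_at_sqrt_minus(n):
    """Approximate j(sqrt(-n)) by the first three terms of the q-expansion.

    Here q = exp(-2*pi*sqrt(n)) is real and very small for n = 14, so the
    omitted terms are far below double precision.
    """
    q = math.exp(-2 * math.pi * math.sqrt(n))
    return 1 / q + 744 + 196884 * q


def closed_form_12_1():
    """Right-hand side of (12.1)."""
    r2 = math.sqrt(2)
    inner = 323 + 228 * r2 + (231 + 161 * r2) * math.sqrt(2 * r2 - 1)
    return 8 * inner ** 3


if __name__ == "__main__":
    print(j_at_sqrt_minus(14))
    print(closed_form_12_1())
```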


At the end of the section, we will present Heegner’s proof of the Baker- 
Heegner-Stark theorem on imaginary quadratic fields of class number 1. 


A. The Cube Root of the j-Function


Our first task is to study the cube root $\gamma_2(\tau)$ of the $j$-function. Recall from §11 that $j(\tau)$ can be written as the quotient

$$j(\tau) = 1728\,\frac{g_2(\tau)^3}{\Delta(\tau)}.$$

The function $\Delta(\tau)$ is nonvanishing and holomorphic on the simply connected domain $\mathfrak{h}$, and hence has a holomorphic cube root $\sqrt[3]{\Delta(\tau)}$. Since $\Delta(\tau)$ is real-valued on the imaginary axis (see Exercise 12.1), we can choose $\sqrt[3]{\Delta(\tau)}$ with the same property. Using this cube root, we define

$$\gamma_2(\tau) = 12\,\frac{g_2(\tau)}{\sqrt[3]{\Delta(\tau)}}.$$

Since $g_2(\tau)$ is also real on the imaginary axis (see Exercise 12.1), it follows that $\gamma_2(\tau)$ is the unique cube root of $j(\tau)$ which is real-valued on the imaginary axis.

For us, the main property of $\gamma_2(\tau)$ is that it can be used to generate all ring class fields of orders of discriminant not divisible by 3. Note that $\tau$ needs to be chosen carefully, for replacing $\tau$ by $\tau + 1$ doesn't affect $j(\tau)$, but we will see below that $\gamma_2(\tau + 1) = \zeta_3^{-1}\gamma_2(\tau)$, where $\zeta_3 = e^{2\pi i/3}$. The necessity to normalize $\tau$ leads to the following theorem:

Theorem 12.2. Let $\mathcal{O}$ be an order of discriminant $D$ in an imaginary quadratic field $K$. Assume that $3 \nmid D$, and write $\mathcal{O} = [1,\tau_0]$, where

$$\tau_0 = \begin{cases} \sqrt{-m}, & D = -4m \equiv 0 \bmod 4 \\[1ex] \dfrac{3 + \sqrt{-m}}{2}, & D = -m \equiv 1 \bmod 4. \end{cases}$$

Then $\gamma_2(\tau_0)$ is an algebraic integer and $K(\gamma_2(\tau_0))$ is the ring class field of $\mathcal{O}$. Furthermore, $\mathbb{Q}(\gamma_2(\tau_0)) = \mathbb{Q}(j(\tau_0))$.


Let's first see how this theorem relates to the $j$-invariants computed in §10. When $\mathcal{O}$ has class number one, we know that $j(\mathcal{O})$ is an integer, so that by Theorem 12.2, $\gamma_2(\tau_0)$ is also an integer when $3 \nmid D$. This explains why

$$j(i) = 12^3, \qquad j(\sqrt{-2}) = 20^3, \qquad j\!\left(\frac{1 + \sqrt{-7}}{2}\right) = -15^3$$

are all perfect cubes. (In the last case, note that $j((1 + \sqrt{-7})/2) = j((3 + \sqrt{-7})/2)$, so that Theorem 12.2 does apply.)
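These cube values can be reproduced numerically. The sketch below is not the method of the text: it assumes the standard identities $j = E_4^3/\Delta$ with $E_4 = 1 + 240\sum \sigma_3(n)q^n$ and $\Delta = q\prod(1 - q^n)^{24}$, truncates both at a fixed number of terms, and evaluates them in floating point. The function names and the truncation length are our own choices.

```python
import cmath
import math


def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def j_invariant(tau, terms=60):
    """Floating-point j(tau) via j = E4^3 / Delta, truncated after `terms` terms."""
    q = cmath.exp(2j * cmath.pi * tau)
    e4 = 1 + 0j
    delta = q
    qn = 1 + 0j
    for n in range(1, terms + 1):
        qn *= q                      # qn = q^n
        e4 += 240 * sigma3(n) * qn
        delta *= (1 - qn) ** 24
    return e4 ** 3 / delta


if __name__ == "__main__":
    print(j_invariant(1j))                              # close to 12^3
    print(j_invariant(1j * math.sqrt(2)))               # close to 20^3
    print(j_invariant((1 + 1j * math.sqrt(7)) / 2))     # close to -15^3
```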


Proof of Theorem 12.2. By Theorem 11.1, we know that $K(j(\tau_0))$ is the ring class field of $\mathcal{O} = [1,\tau_0]$. Thus, to prove Theorem 12.2, it suffices to prove that

$$\mathbb{Q}(\gamma_2(\tau_0)) = \mathbb{Q}(j(\tau_0))$$

whenever $3 \nmid D$. The first proof of this theorem was due to Weber [102, §125], and modern proofs have been given by Birch [7] and Schertz [87]. Our presentation is based on [87].

The first step of the proof is to show that $\gamma_2(3\tau)$ is a modular function.


Proposition 12.3. $\gamma_2(3\tau)$ is a modular function for the group $\Gamma_0(9)$.

Proof. We first study how $\gamma_2(\tau)$ transforms under elements of $\mathrm{SL}(2,\mathbb{Z})$. We claim that

(12.4)  $\gamma_2(-1/\tau) = \gamma_2(\tau)$
        $\gamma_2(\tau + 1) = \zeta_3^{-1}\gamma_2(\tau),$

where $\zeta_3 = e^{2\pi i/3}$. The first line of (12.4) is easy to prove, for $\gamma_2(-1/\tau)$ is a cube root of $j(-1/\tau) = j(\tau)$. But $-1/\tau$ lies on the imaginary axis whenever $\tau$ does, so that $\gamma_2(-1/\tau)$ is a cube root of $j(\tau)$ which is real on the imaginary axis. By the definition of $\gamma_2(\tau)$, this implies $\gamma_2(-1/\tau) = \gamma_2(\tau)$.
To prove the second line of (12.4), consider the $q$-expansion of $\gamma_2(\tau)$. We know that

$$j(\tau) = q^{-1} + \sum_{n=0}^{\infty} c_n q^n = q^{-1}h(q),$$

where $h(q)$ is holomorphic for $|q| < 1$ and $h(0) = 1$. We can therefore write $h(q) = u(q)^3$, where $u(q)$ is holomorphic and $u(0) = 1$. Note also that $u(q)$ has rational coefficients since $h(q)$ does (see Exercise 12.2). Then $q^{-1/3}u(q)$ is a cube root of $j(\tau)$ which is real-valued on the imaginary axis, and it follows that

(12.5)  $\gamma_2(\tau) = q^{-1/3}u(q) = q^{-1/3}\Bigl(1 + \sum_{n=1}^{\infty} b_n q^n\Bigr), \quad b_n \in \mathbb{Q}.$

It is now trivial to see that $\gamma_2(\tau + 1) = \zeta_3^{-1}\gamma_2(\tau)$, and (12.4) is proved.
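We next claim that if $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z})$, then

(12.6)  $\gamma_2\!\left(\dfrac{a\tau + b}{c\tau + d}\right) = \zeta_3^{\,ac - ab + a^2cd - cd}\,\gamma_2(\tau).$

To see this, first note that (12.6) holds for $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ by (12.4). It is well-known that these two matrices generate $\mathrm{SL}(2,\mathbb{Z})$ (see Serre [88, §VII.1] or Exercise 12.3). Then (12.6) follows by induction on the length of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ as a word in $S$ and $T$ (see Exercise 12.5).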
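The induction can be mimicked by machine. The sketch below assumes that the exponent in (12.6) is $ac - ab + a^2cd - cd$, taken mod 3, and that $S$ contributes 0, $T$ contributes $-1$ and $T^{-1}$ contributes $+1$, as (12.4) says. It then checks, for every word of bounded length in $S$, $T$, $T^{-1}$, that the exponent computed from the product matrix agrees with the sum of the contributions of the letters. All names are ours, and a finite check of this kind is only a consistency test.

```python
from itertools import product

S = (0, -1, 1, 0)       # tau -> -1/tau, as (a, b, c, d)
T = (1, 1, 0, 1)        # tau -> tau + 1
T_INV = (1, -1, 0, 1)   # tau -> tau - 1

# Exponent of zeta_3 contributed by each generator, from (12.4).
STEP = {S: 0, T: -1, T_INV: 1}


def mul(g, h):
    """Product of two 2x2 matrices stored as (a, b, c, d)."""
    a, b, c, d = g
    e, f, k, l = h
    return (a * e + b * k, a * f + b * l, c * e + d * k, c * f + d * l)


def exponent(g):
    """Assumed exponent ac - ab + a^2 cd - cd of zeta_3 in (12.6), mod 3."""
    a, b, c, d = g
    return (a * c - a * b + a * a * c * d - c * d) % 3


def consistent(max_len):
    """Compare the formula with the letter-by-letter count on all short words."""
    for length in range(1, max_len + 1):
        for word in product(STEP, repeat=length):
            g, total = (1, 0, 0, 1), 0
            for letter in word:
                g = mul(g, letter)
                total += STEP[letter]
            if exponent(g) != total % 3:
                return False
    return True


if __name__ == "__main__":
    print(consistent(7))
```

In particular the exponent vanishes whenever $b \equiv c \equiv 0 \bmod 3$, which is the invariance used next.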

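Given (12.6), it follows easily that $\gamma_2(\tau)$ is invariant under the group of matrices

$$\Gamma(3) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z}) : b \equiv c \equiv 0 \bmod 3 \right\}.$$

This group is related to $\Gamma_0(9)$ by the identity

$$\Gamma_0(9) = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}^{-1} \Gamma(3) \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix},$$

and a simple computation then shows that $\gamma_2(3\tau)$ is invariant under $\Gamma_0(9)$ (see Exercise 12.5). The group $\Gamma(3)$ is not the largest subgroup of $\mathrm{SL}(2,\mathbb{Z})$ fixing $\gamma_2(\tau)$, but it's the one that relates most easily to the $\Gamma_0(m)$'s (see Exercise 12.5).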

To finish the proof that $\gamma_2(3\tau)$ is a modular function for $\Gamma_0(9)$, we need to check its behavior at the cusps. Let $\gamma \in \mathrm{SL}(2,\mathbb{Z})$. By Theorem 11.9, $j(3\tau)$ is a modular function for $\Gamma_0(3)$, so that $j(3\gamma\tau)$ has a meromorphic expansion in powers of $q^{1/3}$. Taking cube roots, this implies that $\gamma_2(3\gamma\tau)$ has a meromorphic expansion in powers of $q^{1/9}$, which proves that $\gamma_2(3\tau)$ is meromorphic at the cusps. This proves the proposition. Q.E.D.
Once we know that $\gamma_2(3\tau)$ is a modular function for $\Gamma_0(9)$, Theorem 11.9 tells us that it is a rational function in $j(\tau)$ and $j(9\tau)$. The following proposition will give us information about the coefficients of this rational function:


Proposition 12.7. Let $f(\tau)$ be a modular function for $\Gamma_0(m)$.

(i) If the $q$-expansion of $f(\tau)$ has rational coefficients, then $f(\tau) \in \mathbb{Q}(j(\tau), j(m\tau))$.

(ii) Assume in addition that $f(\tau)$ is holomorphic on $\mathfrak{h}$, and let $\tau_0 \in \mathfrak{h}$. If

$$\frac{\partial\Phi_m}{\partial X}(j(m\tau_0), j(\tau_0)) \ne 0,$$

then $f(\tau_0) \in \mathbb{Q}(j(\tau_0), j(m\tau_0))$.


Remark. Note that the hypothesis of (i) involves only the expansion of $f(\tau)$ in powers of $q^{1/m}$. For general $\gamma \in \mathrm{SL}(2,\mathbb{Z})$, the expansion of $f(\gamma\tau)$ need not have coefficients in $\mathbb{Q}$.


Proof. To prove (i), we will use the representation

(12.8)    $$f(\tau) = \frac{G(j(m\tau), j(\tau))}{\dfrac{\partial\Phi_m}{\partial X}(j(m\tau), j(\tau))}$$

given by (11.17). Since the denominator clearly lies in $\mathbb{Q}(j(\tau), j(m\tau))$ (part
(i) of Theorem 11.18), it suffices to show that the same holds for the
numerator. We know that $G(j(m\tau), j(\tau))$ lies in $\mathbb{C}(j(\tau))[j(m\tau)]$, so that

$$G(j(m\tau), j(\tau)) = \frac{P(j(m\tau), j(\tau))}{Q(j(\tau))},$$

where $P(X,Y)$ and $Q(Y) \ne 0$ are polynomials with complex coefficients.
Let’s write these polynomials as

$$P(X,Y) = \sum_{i=0}^{N}\sum_{k=0}^{M} a_{ik}X^iY^k, \qquad Q(Y) = \sum_{l=0}^{L} b_l Y^l.$$

Then (12.8) implies that

$$P(j(m\tau), j(\tau)) = f(\tau)\,\frac{\partial\Phi_m}{\partial X}(j(m\tau), j(\tau))\,Q(j(\tau)),$$

which we can write as

$$\sum_{i=0}^{N}\sum_{k=0}^{M} a_{ik}\, j(m\tau)^i j(\tau)^k = f(\tau)\,\frac{\partial\Phi_m}{\partial X}(j(m\tau), j(\tau)) \Bigl(\sum_{l=0}^{L} b_l\, j(\tau)^l\Bigr).$$

Substituting in the $q$-expansions of $f(\tau)$, $j(\tau)$ and $j(m\tau)$ and equating
coefficients of powers of $q^{1/m}$, we get an infinite system of homogeneous
linear equations with the $a_{ik}$’s and $b_l$’s as unknowns. The $q$-expansions of
$f(\tau)$, $j(\tau)$ and $j(m\tau)$ all have coefficients in $\mathbb{Q}$, and the coefficients of
$(\partial/\partial X)\Phi_m(X,Y)$ are also rational. Thus the coefficients of our system of
equations all lie in $\mathbb{Q}$. This system has a solution over $\mathbb{C}$ which is nontrivial
in the $b_l$’s (since $Q(j(\tau)) \ne 0$), and hence must have a solution over $\mathbb{Q}$ also
nontrivial in the $b_l$’s. This proves that $P(X,Y)$ and $Q(Y) \ne 0$ can be chosen
to have rational coefficients, which proves part (i).

To prove (ii), let’s go back to the definition of $G(X, j(\tau))$ given in
(11.16). Since $f(\tau)$ is holomorphic on $\mathfrak{h}$, the coefficients of $G(X, j(\tau))$ are
also holomorphic on $\mathfrak{h}$. As we saw in Lemma 11.10, this means that the
coefficients are polynomials in $j(\tau)$. Thus, in the representation of $f(\tau)$
given by (12.8), the numerator $G(j(m\tau), j(\tau))$ is a polynomial in $j(m\tau)$ and
$j(\tau)$. By a slight modification of the argument for part (i), we can assume
that it has rational coefficients (see Exercise 12.6). Consequently, whenever
the denominator doesn’t vanish at $\tau_0$, we can evaluate this expression at
$\tau = \tau_0$ to conclude that $f(\tau_0)$ lies in $\mathbb{Q}(j(\tau_0), j(m\tau_0))$. Q.E.D.


We want to apply this proposition to $\gamma_2(\tau_0)$, where $\tau_0$ is given in the
statement of Theorem 12.2. By (12.5), we see that the $q$-expansion of $\gamma_2(3\tau)$
has rational coefficients. Since it is a modular function for $\Gamma_0(9)$,
Proposition 12.7 tells us that

$$\gamma_2(3\tau) \in \mathbb{Q}(j(\tau), j(9\tau)).$$

Since we’re concerned about $\gamma_2(\tau_0)$, we need to evaluate the above expression
at $\tau = \tau_0/3$. We will for the moment assume that

(12.9)    $$\frac{\partial\Phi_9}{\partial X}(j(3\tau_0), j(\tau_0/3)) \ne 0.$$

Since $\gamma_2(3\tau)$ is holomorphic, the second part of Proposition 12.7 then
implies that $\gamma_2(\tau_0) \in \mathbb{Q}(j(\tau_0/3), j(3\tau_0))$, which we can write as

(12.10)    $$\gamma_2(\tau_0) \in \mathbb{Q}(j([1,\tau_0/3]),\, j([1,3\tau_0])).$$

To see what this says about $\gamma_2(\tau_0)$, recall that $\mathcal{O} = [1,\tau_0]$. Then
$\mathcal{O}' = [1,3\tau_0]$ is the order of index 3 in $\mathcal{O}$, and the special form of $\tau_0$ implies
that $[1,\tau_0/3]$ is a proper fractional $\mathcal{O}'$-ideal (this follows from Lemma 7.5
and $3 \nmid D$; see Exercise 12.7). Thus, by Theorem 11.1, both $j(\tau_0/3)$ and
$j(3\tau_0)$ generate the ring class field $L'$ of the order $\mathcal{O}'$. Consequently, (12.10)
implies that $\gamma_2(\tau_0)$ lies in the ring class field $L'$.

Let $L$ denote the ring class field of $\mathcal{O}$, so that $L \subset L'$. To compute the
degree of this extension, recall that the class number is the degree of the
ring class field over $K$. Since $\mathcal{O}$ has discriminant $D$ and $\mathcal{O}'$ has index 3
in $\mathcal{O}$ (hence discriminant $9D$), this means that $[L' : L] = h(9D)/h(D)$.
Corollary 7.28 implies that

$$\frac{h(9D)}{h(D)} = \frac{3}{[\mathcal{O}^* : \mathcal{O}'^*]}\left(1 - \left(\frac{D}{3}\right)\frac{1}{3}\right),$$

and since $3 \nmid D$, we see that $L \subset L'$ is an extension of degree 2 or 4.
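
The degree computation can be illustrated numerically. The sketch below counts reduced primitive positive definite forms to obtain $h(D)$ and then compares $h(9D)$ with $h(D)$ for a few discriminants prime to 3. The function name `class_number` is ours, and counting reduced forms is simply the classical method from the early sections of the book, used here as an independent check on the ratio.

```python
from math import gcd

def class_number(D):
    """Number of reduced primitive forms ax^2 + bxy + cy^2 of discriminant D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    count = 0
    a = 1
    while 3 * a * a <= -D:              # reduced forms satisfy a <= sqrt(-D/3)
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or gcd(gcd(a, abs(b)), c) != 1:
                continue
            if a == c and b < 0:        # when a = c, only b >= 0 is reduced
                continue
            count += 1
        a += 1
    return count

for D in (-4, -7, -8):
    print(D, class_number(9 * D) // class_number(D))
```

For $D = -7$ the Legendre symbol $(D/3)$ is $-1$ and the ratio is 4, while for $D = -8$ it is $+1$ and the ratio is 2; for $D = -4$ the unit index $[\mathcal{O}^* : \mathcal{O}'^*] = 2$ enters and the ratio is again 2.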
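Now consider the following diagram of fields:

$$\begin{array}{ccc} \mathbb{Q}(j(\tau_0)) & \subset & L \\ \cap & & \cap \\ \mathbb{Q}(\gamma_2(\tau_0)) & \subset & L' \end{array}$$

We know that $L$ has degree 2 over $\mathbb{Q}(j(\tau_0))$, and by the above computation,
$L'$ has degree 2 or 4 over $L$. It follows that the degree of $\mathbb{Q}(\gamma_2(\tau_0))$ over
$\mathbb{Q}(j(\tau_0))$ is a power of 2. But recall that $\gamma_2(\tau_0)$ is the real cube root of
$j(\tau_0)$, which means that the extension $\mathbb{Q}(j(\tau_0)) \subset \mathbb{Q}(\gamma_2(\tau_0))$ has degree 1 or
3. Hence this degree must be 1, which proves that $\mathbb{Q}(j(\tau_0)) = \mathbb{Q}(\gamma_2(\tau_0))$.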

We are not quite done with the theorem, for we still have to verify that
(12.9) is satisfied, i.e., that

$$\frac{\partial\Phi_9}{\partial X}(j(3\tau_0), j(\tau_0/3)) \ne 0.$$

For later purposes, we will prove the following general lemma:

Lemma 12.11. Let $\mathcal{O}$ be an order in an imaginary quadratic field, and assume
that $\mathcal{O}^* = \{\pm 1\}$. Write $\mathcal{O} = [1,\alpha]$, and assume that for some integer $s$,
$s \mid T(\alpha)$ and $\gcd(s^2, N(\alpha))$ is squarefree, where $T(\alpha)$ and $N(\alpha)$ are the trace
and norm of $\alpha$. Then for any positive integer $m$,

$$\frac{\partial\Phi_m}{\partial X}(j(m\alpha/s), j(\alpha/s)) \ne 0.$$


Proof. Since $\Phi_m(j(m\alpha/s), j(\alpha/s)) = 0$, the nonvanishing of the partial
derivative means that $j(m\alpha/s)$ is not a multiple root of the polynomial

$$\Phi_m(X, j(\alpha/s)) = \prod_{\sigma \in C(m)} (X - j(\sigma\alpha/s)).$$

Thus we must show that

$$j(m\alpha/s) \ne j(\sigma\alpha/s), \qquad \sigma \in C(m),\quad \sigma \ne \sigma_0 = \begin{pmatrix} m & 0 \\ 0 & 1 \end{pmatrix}.$$

So pick $\sigma = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in C(m)$, $\sigma \ne \sigma_0$, and assume that
$j(m\alpha/s) = j(\sigma\alpha/s)$. In terms of lattices, this means that there is a complex
number $\lambda$ such that

(12.12)    $$\lambda[1, m\alpha/s] = [d, a\alpha/s + b].$$

We will show that this leads to a contradiction when $\mathcal{O}^* = \{\pm 1\}$.

The idea is to prove that $\lambda$ is a unit of $\mathcal{O}$. To see this, note that by
Lemma 11.24, both $[1, m\alpha/s]$ and $[d, a\alpha/s + b]$ have index $m$ in $[1, \alpha/s]$, so
that $\lambda$ must have norm 1. Furthermore, we have

$$s\lambda \in s[d, a\alpha/s + b] = [sd, a\alpha + sb] \subset [s, \alpha].$$

Writing $s\lambda = us + v\alpha$, $u, v \in \mathbb{Z}$, and taking norms, we obtain

$$s^2 = s^2 N(\lambda) = N(us + v\alpha) = u^2s^2 + usvT(\alpha) + v^2N(\alpha).$$

Since $s \mid T(\alpha)$, it follows that $s^2 \mid v^2N(\alpha)$, and since $\gcd(s^2, N(\alpha))$ is
squarefree, we must have $s \mid v$. This shows that $\lambda \in [1,\alpha] = \mathcal{O}$, so that $\lambda$ is a
unit since it has norm 1. Then $\mathcal{O}^* = \{\pm 1\}$ implies that $\lambda = \pm 1$, and hence
$[1, m\alpha/s] = [d, a\alpha/s + b]$, which contradicts $\sigma \ne \sigma_0$ by the uniqueness part
of Lemma 11.24. The lemma is proved. Q.E.D.


We want to apply this lemma to the case $s = 3$, $m = 9$ and $\alpha = \tau_0$.
Using the special form of $\tau_0$, it is easy to see that the norm and trace
conditions are satisfied (note that the discriminant of $\mathcal{O} = [1,\tau_0]$ is
$D = T(\tau_0)^2 - 4N(\tau_0)$). Thus (12.9) holds except possibly when $\mathcal{O}$ is $\mathbb{Z}[i]$ or
$\mathbb{Z}[\zeta_3]$. The latter can’t occur since 3 doesn’t divide the discriminant, and when
$\mathcal{O} = \mathbb{Z}[i]$, a simple argument shows that (12.12) is impossible (see Exercise
12.8). This completes the proof of Theorem 12.2. Q.E.D.


This theorem tells us about the behavior of $\gamma_2(\tau_0)$ when 3 doesn’t divide
the discriminant $D$. For completeness, let’s record what happens when $D$ is
a multiple of 3 (see Schertz [87] for a proof):


Theorem 12.13. Let $\mathcal{O}$ be an order of discriminant $D$ in an imaginary
quadratic field $K$. Assume $3 \mid D$ and $D < -3$, and write $\mathcal{O} = [1,\tau_0]$, where

$$\tau_0 = \begin{cases} \sqrt{-m}, & D = -4m \equiv 0 \bmod 4 \\[4pt] \dfrac{3+\sqrt{-m}}{2}, & D = -m \equiv 1 \bmod 4. \end{cases}$$

Then $K(\gamma_2(\tau_0))$ is the ring class field of the order $\mathcal{O}' = [1,3\tau_0]$ and is an
extension of degree 3 of the ring class field of $\mathcal{O}$. Furthermore,
$\mathbb{Q}(\gamma_2(\tau_0)) = \mathbb{Q}(j(3\tau_0))$. Q.E.D.


B. The Weber Functions


To work effectively with $\gamma_2(\tau)$, we need good formulas for computing it.
This leads us to our next topic, the Dedekind $\eta$-function $\eta(\tau)$ and the three
Weber functions $\mathfrak{f}(\tau)$, $\mathfrak{f}_1(\tau)$ and $\mathfrak{f}_2(\tau)$. If $\tau \in \mathfrak{h}$, we let $q = e^{2\pi i\tau}$ as usual,
and then the Dedekind $\eta$-function is defined by the formula

$$\eta(\tau) = q^{1/24}\prod_{n=1}^{\infty}(1 - q^n).$$

Note that this product converges (and is nonzero) for $\tau \in \mathfrak{h}$ since $0 < |q| < 1$.

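We then define the Weber functions $\mathfrak{f}(\tau)$, $\mathfrak{f}_1(\tau)$ and $\mathfrak{f}_2(\tau)$ in terms of the
$\eta$-function as follows:

(12.14)    $$\mathfrak{f}(\tau) = \zeta_{48}^{-1}\,\frac{\eta((\tau+1)/2)}{\eta(\tau)}, \qquad \mathfrak{f}_1(\tau) = \frac{\eta(\tau/2)}{\eta(\tau)}, \qquad \mathfrak{f}_2(\tau) = \sqrt{2}\,\frac{\eta(2\tau)}{\eta(\tau)},$$

where $\zeta_{48} = e^{2\pi i/48}$. From these definitions, one gets the following product
expansions for the Weber functions:

(12.15)    $$\mathfrak{f}(\tau) = q^{-1/48}\prod_{n=1}^{\infty}(1 + q^{n-1/2}), \qquad \mathfrak{f}_1(\tau) = q^{-1/48}\prod_{n=1}^{\infty}(1 - q^{n-1/2}), \qquad \mathfrak{f}_2(\tau) = \sqrt{2}\,q^{1/24}\prod_{n=1}^{\infty}(1 + q^{n})$$

(see Exercise 12.9), and we also get the following useful identities connecting
the Weber functions:

(12.16)    $$\mathfrak{f}(\tau)\mathfrak{f}_1(\tau)\mathfrak{f}_2(\tau) = \sqrt{2}, \qquad \mathfrak{f}_1(2\tau)\mathfrak{f}_2(\tau) = \sqrt{2}$$

(see Exercise 12.9).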
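
The definitions (12.14) are easy to evaluate numerically, which gives a quick check on the identities (12.16). The following Python sketch truncates the product for $\eta(\tau)$; the function names and the truncation length `terms` are our own choices, and the truncation is adequate only when the imaginary part of $\tau$ is not too small.

```python
import cmath
import math

ZETA48 = cmath.exp(2j * math.pi / 48)

def eta(tau, terms=200):
    """Dedekind eta via q^(1/24) * prod (1 - q^n), truncated after `terms` factors."""
    q = cmath.exp(2j * math.pi * tau)
    prod, qn = 1.0, 1.0
    for _ in range(terms):
        qn *= q                 # qn runs over q, q^2, q^3, ...
        prod *= 1 - qn
    return cmath.exp(2j * math.pi * tau / 24) * prod

def weber_f(tau):
    return eta((tau + 1) / 2) / (ZETA48 * eta(tau))

def weber_f1(tau):
    return eta(tau / 2) / eta(tau)

def weber_f2(tau):
    return math.sqrt(2) * eta(2 * tau) / eta(tau)

tau = 0.3 + 1.1j
print(abs(weber_f(tau) * weber_f1(tau) * weber_f2(tau) - math.sqrt(2)))
print(abs(weber_f1(2 * tau) * weber_f2(tau) - math.sqrt(2)))
```

Both printed quantities should be at the level of floating-point rounding error.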
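Much deeper lie the following relations between $\eta(\tau)$, $\mathfrak{f}(\tau)$, $\mathfrak{f}_1(\tau)$ and
$\mathfrak{f}_2(\tau)$ and the previously defined functions $j(\tau)$, $\gamma_2(\tau)$ and $\Delta(\tau)$:

Theorem 12.17. If $\tau \in \mathfrak{h}$, then $\Delta(\tau) = (2\pi)^{12}\eta(\tau)^{24}$ and

$$\gamma_2(\tau) = \frac{\mathfrak{f}(\tau)^{24} - 16}{\mathfrak{f}(\tau)^{8}} = \frac{\mathfrak{f}_1(\tau)^{24} + 16}{\mathfrak{f}_1(\tau)^{8}} = \frac{\mathfrak{f}_2(\tau)^{24} + 16}{\mathfrak{f}_2(\tau)^{8}}.$$


Remark. Since $j(\tau) = \gamma_2(\tau)^3$, this theorem gives us some remarkable
formulas for computing the $j$-function.


Proof. We need to relate $\eta(\tau)$ and the Weber functions to the Weierstrass
$\wp$-function. Let $\wp(z) = \wp(z;\tau)$ denote the $\wp$-function for the lattice $[1,\tau]$,
and set

$$e_1 = \wp(\tau/2), \qquad e_2 = \wp(1/2), \qquad e_3 = \wp((\tau+1)/2).$$

We will prove the following formulas for the differences $e_i - e_j$:

(12.18)    $$e_2 - e_1 = \pi^2\eta(\tau)^4\,\mathfrak{f}(\tau)^8, \qquad e_2 - e_3 = \pi^2\eta(\tau)^4\,\mathfrak{f}_1(\tau)^8, \qquad e_3 - e_1 = \pi^2\eta(\tau)^4\,\mathfrak{f}_2(\tau)^8.$$

The basic strategy of the proof is to express $e_i - e_j$ in terms of the
Weierstrass $\sigma$-function, and then use the product expansion of the $\sigma$-function to
get product expansions for $e_i - e_j$. Proofs will appear in the exercises.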

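The Weierstrass $\sigma$-function is defined as follows. Let $\tau \in \mathfrak{h}$, and let $L$ be
the lattice $[1,\tau]$. Then the Weierstrass $\sigma$-function is the product

$$\sigma(z;\tau) = z\prod_{\omega \in L - \{0\}}\Bigl(1 - \frac{z}{\omega}\Bigr)e^{z/\omega + (1/2)(z/\omega)^2}.$$

Note that $\sigma(z;\tau)$ is an odd function in $z$. We will usually write $\sigma(z;\tau)$ more
simply as $\sigma(z)$. The $\sigma$-function is not periodic, but there are complex numbers
$\eta_1$ and $\eta_2$, depending only on $\tau$, such that

$$\sigma(z + \tau) = -e^{\eta_1(z + \tau/2)}\sigma(z), \qquad \sigma(z + 1) = -e^{\eta_2(z + 1/2)}\sigma(z),$$

and the numbers $\eta_1$ and $\eta_2$ satisfy the Legendre relation $\eta_2\tau - \eta_1 = 2\pi i$ (see
Exercise 12.10). The $\sigma$-function is related to the $\wp$-function by the formula

$$\wp(z) - \wp(w) = -\frac{\sigma(z+w)\,\sigma(z-w)}{\sigma^2(z)\,\sigma^2(w)}$$

whenever $z$ and $w$ do not lie in $L$ (see Exercise 12.11). Since $e_1 = \wp(\tau/2)$,
$e_2 = \wp(1/2)$ and $e_3 = \wp((\tau+1)/2)$, it follows easily that

$$e_2 - e_1 = -e^{-\eta_2\tau/2}\,\frac{\sigma^2((\tau+1)/2)}{\sigma^2(1/2)\,\sigma^2(\tau/2)}$$
$$e_2 - e_3 = -e^{\eta_2(\tau+1)/2}\,\frac{\sigma^2(\tau/2)}{\sigma^2(1/2)\,\sigma^2((\tau+1)/2)}$$
$$e_3 - e_1 = e^{\eta_1(\tau+1)/2}\,\frac{\sigma^2(1/2)}{\sigma^2(\tau/2)\,\sigma^2((\tau+1)/2)}$$

(see Exercise 12.12).
There is also the following $q$-product expansion for the $\sigma$-function:

$$\sigma(z;\tau) = \frac{1}{2\pi i}\,e^{\eta_2 z^2/2}\,\bigl(q_z^{1/2} - q_z^{-1/2}\bigr)\prod_{n=1}^{\infty}\frac{(1 - q_\tau^n q_z)(1 - q_\tau^n/q_z)}{(1 - q_\tau^n)^2},$$

where $q_\tau = e^{2\pi i\tau}$ and $q_z = e^{2\pi i z}$ (see Exercise 12.13). Using this product
expansion, we obtain

$$\sigma\Bigl(\frac{1}{2}\Bigr) = \frac{1}{2\pi}\,e^{\eta_2/8}\,\frac{\mathfrak{f}_2(\tau)^2}{\eta(\tau)^2}$$
$$\sigma\Bigl(\frac{\tau}{2}\Bigr) = \frac{i}{2\pi}\,e^{\eta_2\tau^2/8}\,q^{-1/8}\,\frac{\mathfrak{f}_1(\tau)^2}{\eta(\tau)^2}$$
$$\sigma\Bigl(\frac{\tau+1}{2}\Bigr) = \frac{1}{2\pi}\,e^{\eta_2(\tau+1)^2/8}\,q^{-1/8}\,\frac{\mathfrak{f}(\tau)^2}{\eta(\tau)^2}$$

(see Exercise 12.14). It is now straightforward to derive the desired formulas
(12.18) for $e_i - e_j$ (see Exercise 12.14).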

To relate this to $\Delta(\tau)$, recall from (10.6) that
$\Delta(\tau) = 16(e_2 - e_1)^2(e_2 - e_3)^2(e_3 - e_1)^2$. By (12.18), it is now easy to express
$\Delta(\tau)$ in terms of the $\eta$-function:

$$\Delta(\tau) = 16(e_2 - e_1)^2(e_2 - e_3)^2(e_3 - e_1)^2 = 16\pi^{12}\eta(\tau)^{24}\,\mathfrak{f}(\tau)^{16}\mathfrak{f}_1(\tau)^{16}\mathfrak{f}_2(\tau)^{16} = (2\pi)^{12}\eta(\tau)^{24},$$

where the last equality follows by (12.16).

Turning to $\gamma_2(\tau)$, we know that

$$\gamma_2(\tau) = \sqrt[3]{j(\tau)} = \frac{12\,g_2(\tau)}{\sqrt[3]{\Delta(\tau)}},$$

where the cube root is chosen to be real-valued on the imaginary axis. Using
what we just proved about $\Delta(\tau)$, this formula can be written

$$\gamma_2(\tau) = \frac{3\,g_2(\tau)}{4\pi^4\eta(\tau)^8}$$

since $\eta(\tau)$ is real valued on the imaginary axis. Thus, to express $\gamma_2(\tau)$ in
terms of Weber functions, we need to express $g_2(\tau)$ in terms of $\eta(\tau)$, $\mathfrak{f}(\tau)$,
$\mathfrak{f}_1(\tau)$ and $\mathfrak{f}_2(\tau)$.

The idea is to write $g_2(\tau)$ in terms of the $e_i - e_j$’s. Recall from the proof
of Proposition 10.7 that the $e_i$’s are the roots of $4x^3 - g_2(\tau)x - g_3(\tau)$, which
implies that $g_2(\tau) = -4(e_1e_2 + e_1e_3 + e_2e_3)$ (see Exercise 10.8). Then, using
$e_1 + e_2 + e_3 = 0$, one obtains

$$3g_2(\tau) = 4\bigl((e_2 - e_1)^2 - (e_2 - e_3)(e_3 - e_1)\bigr)$$

(see Exercise 12.15). Substituting in the formulas from (12.18) yields

$$3g_2(\tau) = 4\pi^4\eta(\tau)^8\bigl(\mathfrak{f}(\tau)^{16} - \mathfrak{f}_1(\tau)^8\mathfrak{f}_2(\tau)^8\bigr),$$

so that

$$\gamma_2(\tau) = \mathfrak{f}(\tau)^{16} - \mathfrak{f}_1(\tau)^8\mathfrak{f}_2(\tau)^8 = \mathfrak{f}(\tau)^{16} - \frac{16}{\mathfrak{f}(\tau)^8} = \frac{\mathfrak{f}(\tau)^{24} - 16}{\mathfrak{f}(\tau)^8},$$

where we have again used the basic identity (12.16). The other two formulas
for $\gamma_2(\tau)$ are proved similarly and are left to the reader (see Exercise
12.15). This completes the proof of the theorem. Q.E.D.
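
As an illustration of Theorem 12.17, the three expressions for $\gamma_2(\tau)$ can be compared numerically. The sketch below uses the product expansions (12.15), truncated after a fixed number of factors; the helper names `weber` and `gamma2_three_ways` are ours. At $\tau = i$ all three values should be close to $12$ (since $j(i) = 1728 = 12^3$), and at $\tau = \sqrt{-2}$ close to $20$.

```python
import cmath
import math

def weber(tau, terms=200):
    """Return (f, f1, f2) at tau from the product expansions (12.15)."""
    half = cmath.exp(1j * math.pi * tau)      # q^(1/2)
    q = half * half
    pf = pf1 = pf2 = 1.0
    x, y = 1 / half, 1.0                      # x -> q^(n - 1/2), y -> q^n
    for _ in range(terms):
        x *= q
        y *= q
        pf *= 1 + x
        pf1 *= 1 - x
        pf2 *= 1 + y
    lead = cmath.exp(-2j * math.pi * tau / 48)            # q^(-1/48)
    return (lead * pf, lead * pf1,
            math.sqrt(2) * cmath.exp(2j * math.pi * tau / 24) * pf2)

def gamma2_three_ways(tau):
    """The three expressions for gamma_2(tau) in Theorem 12.17."""
    f, f1, f2 = weber(tau)
    return ((f ** 24 - 16) / f ** 8,
            (f1 ** 24 + 16) / f1 ** 8,
            (f2 ** 24 + 16) / f2 ** 8)

print(gamma2_three_ways(1j))
print(gamma2_three_ways(math.sqrt(2) * 1j))
```

Agreement of the three values at a generic point such as $\tau = 0.3 + 1.2i$ is a further check that the signs in the theorem have been transcribed correctly.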


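Using these formulas it is easy to show that the $q$-expansions of $\gamma_2(\tau)$
and $j(\tau)$ have integer coefficients (see Exercise 12.16), and this proves
Theorem 11.8. We can also use Theorem 12.17 to study the transformation
properties of $\eta(\tau)$, $\mathfrak{f}(\tau)$, $\mathfrak{f}_1(\tau)$ and $\mathfrak{f}_2(\tau)$:

Corollary 12.19. For a positive integer $n$, let $\zeta_n = e^{2\pi i/n}$. Then

$$\eta(\tau + 1) = \zeta_{24}\,\eta(\tau)$$
$$\eta(-1/\tau) = \sqrt{-i\tau}\,\eta(\tau),$$

where the square root is chosen to be positive on the imaginary axis.
Furthermore,

$$\mathfrak{f}(\tau + 1) = \zeta_{48}^{-1}\,\mathfrak{f}_1(\tau)$$
$$\mathfrak{f}_1(\tau + 1) = \zeta_{48}^{-1}\,\mathfrak{f}(\tau)$$
$$\mathfrak{f}_2(\tau + 1) = \zeta_{24}\,\mathfrak{f}_2(\tau),$$

and

$$\mathfrak{f}(-1/\tau) = \mathfrak{f}(\tau)$$
$$\mathfrak{f}_1(-1/\tau) = \mathfrak{f}_2(\tau)$$
$$\mathfrak{f}_2(-1/\tau) = \mathfrak{f}_1(\tau).$$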
Proof. The definition of $\eta(\tau)$ makes the formula for $\eta(\tau + 1)$ obvious.
Turning to $\eta(-1/\tau)$, first consider $\Delta(\tau) = (2\pi)^{12}\eta(\tau)^{24}$. For a lattice $L$, we
know

$$\Delta(L) = g_2(L)^3 - 27g_3(L)^2.$$

In (10.10) we showed that $g_2(\lambda L) = \lambda^{-4}g_2(L)$ and $g_3(\lambda L) = \lambda^{-6}g_3(L)$,
which implies that

$$\Delta(\lambda L) = \lambda^{-12}\Delta(L).$$

This gives us the formula

$$\Delta(-1/\tau) = \Delta([1,-1/\tau]) = \Delta(\tau^{-1}[\tau,-1]) = \tau^{12}\Delta([\tau,-1]) = \tau^{12}\Delta(\tau),$$

and taking 24th roots, we obtain

$$\eta(-1/\tau) = \epsilon\sqrt{-i\tau}\,\eta(\tau)$$

for some root of unity $\epsilon$. Both sides take positive real values on the
imaginary axis, which forces $\epsilon$ to be 1. This proves that $\eta(\tau)$ transforms as
desired.

Turning to the Weber functions, their behavior under $\tau \mapsto \tau + 1$ and
$\tau \mapsto -1/\tau$ is a simple consequence of their definitions and the transformation
properties of $\eta(\tau)$ (see Exercise 12.17). Q.E.D.
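
The transformation laws of Corollary 12.19 can also be tested numerically. The sketch below recomputes $\eta$ and the Weber functions from (12.14) and reports the largest defect among the eight identities at a given point; the names `eta`, `weber` and `max_defect` are illustrative. Python's `cmath.sqrt` returns the principal square root, which is positive on the positive real axis and therefore matches the branch of $\sqrt{-i\tau}$ specified in the corollary for $\tau$ in the upper half plane.

```python
import cmath
import math

def eta(tau, terms=200):
    """Dedekind eta via its product formula, truncated after `terms` factors."""
    q = cmath.exp(2j * math.pi * tau)
    prod, qn = 1.0, 1.0
    for _ in range(terms):
        qn *= q
        prod *= 1 - qn
    return cmath.exp(2j * math.pi * tau / 24) * prod

def weber(tau):
    """Return (f, f1, f2) at tau, using the definitions (12.14)."""
    zeta48 = cmath.exp(2j * math.pi / 48)
    e = eta(tau)
    return (eta((tau + 1) / 2) / (zeta48 * e),
            eta(tau / 2) / e,
            math.sqrt(2) * eta(2 * tau) / e)

def max_defect(tau):
    """Largest absolute error among the identities of Corollary 12.19 at tau."""
    zeta24 = cmath.exp(2j * math.pi / 24)
    zeta48 = cmath.exp(2j * math.pi / 48)
    f, f1, f2 = weber(tau)
    g, g1, g2 = weber(tau + 1)
    h, h1, h2 = weber(-1 / tau)
    return max(
        abs(eta(tau + 1) - zeta24 * eta(tau)),
        abs(eta(-1 / tau) - cmath.sqrt(-1j * tau) * eta(tau)),
        abs(g - f1 / zeta48), abs(g1 - f / zeta48), abs(g2 - zeta24 * f2),
        abs(h - f), abs(h1 - f2), abs(h2 - f1),
    )

print(max_defect(0.2 + 0.9j))
```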


We will make extensive use of these transformation properties in the 
latter part of this section. 


C. j-Invariants of Orders of Class Number 1 


Using the properties of the Weber functions, we can now compute the
$j$-invariants for orders of class number 1. In §7 we saw that there are exactly
13 such orders, with discriminants

$$-3,\ -4,\ -7,\ -8,\ -11,\ -12,\ -16,\ -19,\ -27,\ -28,\ -43,\ -67,\ -163$$

(we will prove this in Theorem 12.34 below). The $j$-invariants of these orders
are integers, and if we restrict ourselves to those where 3 doesn’t divide
the discriminant (10 of the above 13), then Theorem 12.2 tells us that
the $j$-invariant is a cube. So in these cases we need only compute $\gamma_2(\tau_0)$,
where $\tau_0$ is an appropriately chosen element of the order. Rather than compute
$\gamma_2(\tau_0)$ directly, we will use the Weber functions to approximate its
value to within $\pm .5$. Since $\gamma_2(\tau_0)$ is an integer, this will determine its value
uniquely. This scheme for computing these $j$-invariants is due to Weber
[102, §125].

The ten $j$-invariants we want to compute are given in the following table:

(12.20)
    D       tau_0                gamma_2(tau_0)                           j(tau_0) = j(O)
    -4      i                    12 = 2^2 * 3                             12^3
    -7      (3 + sqrt(-7))/2     -15 = -3 * 5                             -15^3
    -8      sqrt(-2)             20 = 2^2 * 5                             20^3
    -11     (3 + sqrt(-11))/2    -32 = -2^5                               -32^3
    -16     2i                   66 = 2 * 3 * 11                          66^3
    -19     (3 + sqrt(-19))/2    -96 = -2^5 * 3                           -96^3
    -28     sqrt(-7)             255 = 3 * 5 * 17                         255^3
    -43     (3 + sqrt(-43))/2    -960 = -2^6 * 3 * 5                      -960^3
    -67     (3 + sqrt(-67))/2    -5280 = -2^5 * 3 * 5 * 11                -5280^3
    -163    (3 + sqrt(-163))/2   -640320 = -2^6 * 3 * 5 * 23 * 29         -640320^3

For completeness, here are the $j$-invariants of the orders of discriminant
divisible by 3:

    D       tau_0                  j(tau_0) = j(O)
    -3      (1 + sqrt(-3))/2       0
    -12     sqrt(-3)               54000 = 2^4 * 3^3 * 5^3
    -27     (1 + 3 sqrt(-3))/2     -12288000 = -2^15 * 3 * 5^3

We computed $j((1 + \sqrt{-3})/2) = 0$ in §10, and we will prove $j(\sqrt{-3}) =
54000$ in §13. As predicted by Theorem 12.13, the last two entries are not
perfect cubes.

To start the computation, first consider the case of even discriminant.
Here, $\tau_0 = \sqrt{-m}$, where $m = 1$, 2, 4 or 7. Setting $q = e^{2\pi i\sqrt{-m}} = e^{-2\pi\sqrt{m}}$,
we claim that

(12.21)    $$\gamma_2(\sqrt{-m}) = [[\,256q^{2/3} + q^{-1/3}\,]],$$

where $[[\ ]]$ is the nearest integer function (i.e., for a real number $x \notin \mathbb{Z} + \frac{1}{2}$,
$[[x]]$ is the integer nearest to $x$).

To prove this, we will write $\gamma_2(\tau)$ in terms of the Weber function $\mathfrak{f}_2(\tau)$:

(12.22)    $$\gamma_2(\sqrt{-m}) = \mathfrak{f}_2(\sqrt{-m})^{16} + \frac{16}{\mathfrak{f}_2(\sqrt{-m})^{8}}.$$

Using $q = e^{-2\pi\sqrt{m}}$ as above, (12.15) gives us

$$\mathfrak{f}_2(\sqrt{-m}) = \sqrt{2}\,q^{1/24}\prod_{n=1}^{\infty}(1 + q^n),$$

and to estimate the infinite product, we use the inequality $1 + x < e^x$ for
$x > 0$. This yields

$$1 < \prod_{n=1}^{\infty}(1 + q^n) < \prod_{n=1}^{\infty} e^{q^n} = e^{q/(1-q)},$$

and we can simplify the exponent by noting that $q/(1 - q) \le q/(1 - e^{-2\pi})
< 1.002q$ since $q \le e^{-2\pi}$. Thus we have the following inequalities for
$\mathfrak{f}_2(\sqrt{-m})$:

$$\sqrt{2}\,q^{1/24} < \mathfrak{f}_2(\sqrt{-m}) < \sqrt{2}\,q^{1/24}e^{1.002q},$$

and applying this to (12.22), we get upper and lower bounds for $\gamma_2(\sqrt{-m})$:

(12.23)    $$256q^{2/3} + q^{-1/3}e^{-8.016q} < \gamma_2(\sqrt{-m}) < 256q^{2/3}e^{16.032q} + q^{-1/3}.$$

To see how sharp these bounds are, consider their difference

$$E = 256q^{2/3}(e^{16.032q} - 1) + q^{-1/3}(1 - e^{-8.016q}).$$

Using the inequality

$$1 - e^{-x} < \frac{x}{1 - x}, \qquad 0 < x < 1,$$

one sees that

$$E < 256q^{2/3}(e^{16.032q} - 1) + q^{-1/3}\,\frac{8.016q}{1 - 8.016q} = 256q^{2/3}(e^{16.032q} - 1) + \frac{8.016q^{2/3}}{1 - 8.016q}.$$

The last quantity is an increasing function in $q$, and then $q \le e^{-2\pi}$ easily
implies that $E < .25$. Since $\gamma_2(\sqrt{-m})$ is an integer, this means that $[[x]] =
\gamma_2(\sqrt{-m})$ for any $x$ lying between the upper and lower limits of (12.23). In
particular, $256q^{2/3} + q^{-1/3}$ lies between these limits, which proves (12.21).
Using a hand calculator, it is now trivial to compute the corresponding
entries in table (12.20) (see Exercise 12.18).
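
In place of a hand calculator, formula (12.21) can be evaluated in a few lines of Python. This is only a sketch of the computation described above; the function name `gamma2_even` is ours, and ordinary double-precision arithmetic is more than enough here because the quantity being rounded lies within $.25$ of an integer.

```python
import math

def gamma2_even(m):
    """Nearest integer to 256 q^(2/3) + q^(-1/3), q = exp(-2 pi sqrt(m)), as in (12.21)."""
    q = math.exp(-2 * math.pi * math.sqrt(m))
    return round(256 * q ** (2 / 3) + q ** (-1 / 3))

for m in (1, 2, 4, 7):
    g = gamma2_even(m)
    print(f"D = {-4 * m}: gamma_2 = {g}, j = {g ** 3}")
```

The four values of $\gamma_2$ produced are 12, 20, 66 and 255, in agreement with the even-discriminant rows of table (12.20).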

Turning to the case of odd discriminant, let $\tau_0 = (3 + \sqrt{-m})/2$, $m = 7$,
11, 19, 43, 67 or 163, and we again want to compute

$$\gamma_2(\tau_0) = \mathfrak{f}_2(\tau_0)^{16} + \frac{16}{\mathfrak{f}_2(\tau_0)^{8}}.$$

Our previous techniques won’t work, for $q = e^{2\pi i(3+\sqrt{-m})/2} = -e^{-\pi\sqrt{m}}$ is
negative in this case. But Weber uses the following clever trick: from
(12.16), we know that

$$\mathfrak{f}_2(\tau_0) = \frac{\sqrt{2}}{\mathfrak{f}_1(2\tau_0)},$$

and then the transformation properties from Corollary 12.19 imply

$$\mathfrak{f}_1(2\tau_0) = \mathfrak{f}_1(3 + \sqrt{-m}) = \zeta_{48}^{-1}\,\mathfrak{f}(2 + \sqrt{-m}) = \zeta_{48}^{-2}\,\mathfrak{f}_1(1 + \sqrt{-m}) = \zeta_{48}^{-3}\,\mathfrak{f}(\sqrt{-m}).$$

Combining the above equations implies that

$$\mathfrak{f}_2(\tau_0) = \frac{\sqrt{2}\,\zeta_{16}}{\mathfrak{f}(\sqrt{-m})},$$

and thus

$$\gamma_2(\tau_0) = \frac{256}{\mathfrak{f}(\sqrt{-m})^{16}} - \mathfrak{f}(\sqrt{-m})^{8}.$$

From here, our previous methods easily imply that if $m = 7$, 11, 19, 43,
67 or 163, and $q = e^{-2\pi\sqrt{m}}$, then

$$\gamma_2((3 + \sqrt{-m})/2) = [[\,-q^{-1/6} + 256q^{1/3}\,]],$$

where $[[\ ]]$ is again the nearest integer function. Using a hand calculator, we
can now complete our table (12.20) of singular $j$-invariants (see Exercise
12.18).
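
The odd-discriminant rows of table (12.20) can be reproduced the same way. The following sketch evaluates the nearest-integer formula above in double precision; `gamma2_odd` is our own name for the helper.

```python
import math

def gamma2_odd(m):
    """Nearest integer to -q^(-1/6) + 256 q^(1/3), with q = exp(-2 pi sqrt(m))."""
    q = math.exp(-2 * math.pi * math.sqrt(m))
    return round(-q ** (-1 / 6) + 256 * q ** (1 / 3))

for m in (7, 11, 19, 43, 67, 163):
    print(f"D = {-m}: gamma_2 = {gamma2_odd(m)}")
```

For $m = 163$ the dominant term is $-e^{\pi\sqrt{163}/3}$, which is extremely close to $-640320$; this is the familiar near-integrality of $e^{\pi\sqrt{163}}$ seen through its cube root.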


D. Weber’s Computation of j(/—14) 


We next want to compute some singular $j$-invariants when the class number
is greater than 1. There are several ways one can proceed. For example,
when the class number is 2, the Kronecker Limit Formula gives an elegant
method to determine the $j$-invariant, and this method generalizes to the
case of orders with only one class per genus. (Recall from §3 that for
discriminants $-4n$, this condition means that $n$ is one of Euler’s convenient
numbers.) For example, when $n = 105$, Weber [102, §143] shows that

$$\mathfrak{f}(\sqrt{-105})^6 = \sqrt{2}^{\,-13}(1 + \sqrt{3})^3(1 + \sqrt{5})^3(\sqrt{3} + \sqrt{7})^3(\sqrt{5} + \sqrt{7}),$$

which would then allow us to compute $\gamma_2(\sqrt{-105})$ and hence $j(\sqrt{-105})$.
(The radicals appearing in the above formula are not surprising, since in
this case the Hilbert class field equals the genus field, which we know by
Theorem 6.1; see Exercise 12.19.) Other examples may be found in Weber
[102, pp. 721–726] or [103], and a modern treatment of the Kronecker Limit
Formula is in Lang [73, Chapter 20].
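
Weber's closed form for $\mathfrak{f}(\sqrt{-105})^6$ can be compared against the product expansion (12.15). The sketch below does this in double precision; the function names are ours, and agreement to roughly nine significant digits is of course only numerical evidence, not a proof.

```python
import math

def weber_f_at_sqrt_minus(m, terms=50):
    """f(sqrt(-m)) from the product expansion (12.15), with q = exp(-2 pi sqrt(m))."""
    half = math.exp(-math.pi * math.sqrt(m))     # q^(1/2)
    q = half * half
    prod, x = 1.0, 1.0 / half                    # x will run over q^(n - 1/2)
    for _ in range(terms):
        x *= q
        prod *= 1 + x
    return math.exp(math.pi * math.sqrt(m) / 24) * prod   # q^(-1/48) * product

def weber_closed_form_105():
    """Right-hand side of Weber's formula for f(sqrt(-105))^6."""
    r3, r5, r7 = math.sqrt(3), math.sqrt(5), math.sqrt(7)
    return (2 ** -6.5) * (1 + r3) ** 3 * (1 + r5) ** 3 * (r3 + r7) ** 3 * (r5 + r7)

print(weber_f_at_sqrt_minus(105) ** 6, weber_closed_form_105())
```

Both numbers are approximately 3127.33.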
We will instead take a different route and compute $j(\sqrt{-14})$, an example
particularly relevant to earlier sections. Namely, $K(j(\sqrt{-14}))$ is the
Hilbert class field of $K = \mathbb{Q}(\sqrt{-14})$ since $\mathcal{O}_K = [1, \sqrt{-14}]$. We determined
this field in §5, so that finding $j(\sqrt{-14})$ will give us a second and quite
different way of finding the Hilbert class field of $\mathbb{Q}(\sqrt{-14})$. Our exposition
will again follow Weber [102, §144], using ideas from Schertz [87] to give a
modern proof.

A key fact we will use is that in many cases, one can generate ring class
fields using small powers of the Weber functions. Weber gives a long list of
such theorems in [102, §§126–127], and modern proofs have been given by
Birch [7] and Schertz [87]. We will discuss two cases which will be useful to
our purposes:


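Theorem 12.24. Given a positive integer $m$ not divisible by 3, let $\mathcal{O} =
[1, \sqrt{-m}]$, which is an order in $K = \mathbb{Q}(\sqrt{-m})$. Then:

(i) For $m \equiv 6 \bmod 8$, $\mathfrak{f}_1(\sqrt{-m})^2$ is an algebraic integer and $K(\mathfrak{f}_1(\sqrt{-m})^2)$
is the ring class field of $\mathcal{O}$.

(ii) For $m \equiv 3 \bmod 4$, $\mathfrak{f}(\sqrt{-m})^2$ is an algebraic integer and $K(\mathfrak{f}(\sqrt{-m})^2)$ is
the ring class field of $\mathcal{O}$.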

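Proof. We begin with (i). Multiplying out the identity

$$j(\sqrt{-m}) = \gamma_2(\sqrt{-m})^3 = \left(\frac{\mathfrak{f}_1(\sqrt{-m})^{24} + 16}{\mathfrak{f}_1(\sqrt{-m})^{8}}\right)^3,$$

it follows that $\mathfrak{f}_1(\sqrt{-m})$ is a root of a monic polynomial with coefficients
in $\mathbb{Z}[j(\sqrt{-m})]$. But $j(\sqrt{-m})$ is an algebraic integer, which implies that the
same is true for $\mathfrak{f}_1(\sqrt{-m})^2$.

We know that $L = K(j(\sqrt{-m}))$ is the ring class field of $[1, \sqrt{-m}]$,
and since $j(\sqrt{-m})$ is a rational function in $\mathfrak{f}_1(\sqrt{-m})^2$, we need only show that
$\mathfrak{f}_1(\sqrt{-m})^2$ lies in $L$. Actually, it suffices to show that $\mathfrak{f}_1(\sqrt{-m})^6$ lies in $L$.
This is a consequence of Theorems 12.2 and 12.17, for since $3 \nmid m$, we
have $\gamma_2(\sqrt{-m}) \in L$, and we also know that

$$\gamma_2(\sqrt{-m}) = \frac{\mathfrak{f}_1(\sqrt{-m})^{24} + 16}{\mathfrak{f}_1(\sqrt{-m})^{8}}.$$

When $\mathfrak{f}_1(\sqrt{-m})^6$ lies in $L$, so does $\mathfrak{f}_1(\sqrt{-m})^{24}$. The above equation implies
$\mathfrak{f}_1(\sqrt{-m})^8 \in L$, and then $\mathfrak{f}_1(\sqrt{-m})^2 \in L$ follows immediately.
The next step in the proof is to show that $\mathfrak{f}_1(8\tau)^6$ is a modular function:


Proposition 12.25. $\mathfrak{f}_1(8\tau)^6$ is a modular function for the group $\Gamma_0(32)$.

Proof. We first study the transformation properties of $\mathfrak{f}_1(\tau)^6$. Consider the
group

$$\Gamma_0(2)^t = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : b \equiv 0 \bmod 2 \right\}.$$

In Exercise 12.4, we will show that the matrices

$$-I = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad V = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$$

generate $\Gamma_0(2)^t$. Using Corollary 12.19, $\mathfrak{f}_1(\tau)^6$ transforms under $U$ and $V$ as
follows:

$$\mathfrak{f}_1(U\tau)^6 = -i\,\mathfrak{f}_1(\tau)^6$$
$$\mathfrak{f}_1(V\tau)^6 = -i\,\mathfrak{f}_1(\tau)^6$$

(see Exercise 12.20). Then we get the general transformation law for $\mathfrak{f}_1(\tau)^6$:

(12.26)    $$\mathfrak{f}_1(\gamma\tau)^6 = i^{\,-ac - (1/2)bd + (1/2)b^2c}\,\mathfrak{f}_1(\tau)^6, \qquad \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(2)^t.$$

This can be proved by induction on the length of $\gamma$ as a word in $-I$, $U$ and
$V$. A more enlightening way to prove (12.26) is sketched in Exercise 12.21.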
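
Before using (12.26), one can check by machine that the exponent of $i$ is additive modulo 4 on words in the generators, as any transformation law of this shape must be. The Python sketch below assumes the generators $U = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ and $V = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ of $\Gamma_0(2)^t$ together with $-I$ and the inverses of $U$ and $V$; the helper names are ours.

```python
from itertools import product

NEG_I = ((-1, 0), (0, -1))
U, U_INV = ((1, 2), (0, 1)), ((1, -2), (0, 1))
V, V_INV = ((1, 0), (1, 1)), ((1, 0), (-1, 1))

def mul(g, h):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return ((a * e + b * p, a * f + b * q), (c * e + d * p, c * f + d * q))

def exponent(g):
    """Exponent of i in (12.26), reduced mod 4; requires the entry b to be even."""
    (a, b), (c, d) = g
    assert b % 2 == 0
    return (-a * c - (b * d) // 2 + (b * b * c) // 2) % 4

def check_words(max_len=5):
    """True if the exponent is additive on every word of length <= max_len."""
    for n in range(1, max_len + 1):
        for word in product((NEG_I, U, U_INV, V, V_INV), repeat=n):
            g, total = ((1, 0), (0, 1)), 0
            for x in word:
                g = mul(g, x)
                total += exponent(x)
            if exponent(g) != total % 4:
                return False
    return True

print(check_words(), exponent(((1, -2), (1, -1))))
```

The second printed value is the exponent for the matrix $\begin{pmatrix} 1 & -2 \\ 1 & -1 \end{pmatrix}$, which plays a role later in the proof of Theorem 12.24; it is 0, so $\mathfrak{f}_1(\tau)^6$ is invariant under that matrix.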
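Now consider the function $\mathfrak{f}_1(8\tau)^6$, and let $\gamma = \begin{pmatrix} a & b \\ 32c & d \end{pmatrix} \in \Gamma_0(32)$. Then

$$8\gamma\tau = 8\begin{pmatrix} a & b \\ 32c & d \end{pmatrix}\tau = \begin{pmatrix} a & 8b \\ 4c & d \end{pmatrix} 8\tau = \tilde\gamma\,8\tau.$$

Since $\tilde\gamma \in \Gamma_0(2)^t$, it follows easily from (12.26) that $\mathfrak{f}_1(8\gamma\tau)^6 = \mathfrak{f}_1(\tilde\gamma\,8\tau)^6 =
\mathfrak{f}_1(8\tau)^6$, which proves that $\mathfrak{f}_1(8\tau)^6$ is invariant under $\Gamma_0(32)$. To check the
cusps, take $\gamma \in SL(2,\mathbb{Z})$. Under the correspondence between cosets of $\Gamma_0(8)$
and matrices in $C(8)$ given by Lemma 11.11, there is $\sigma \in C(8)$ and $\tilde\gamma \in
SL(2,\mathbb{Z})$ such that

$$8\gamma\tau = \tilde\gamma\sigma\tau.$$

Writing $\tilde\gamma$ as a product of various powers of $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and
$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, the transformation properties of Corollary 12.19 imply that

$$\mathfrak{f}_1(8\gamma\tau)^6 = \mathfrak{f}_1(\tilde\gamma\sigma\tau)^6 = \epsilon\,\mathfrak{f}(\sigma\tau)^6,\ \epsilon\,\mathfrak{f}_1(\sigma\tau)^6,\ \text{or}\ \epsilon\,\mathfrak{f}_2(\sigma\tau)^6$$

for some root of unity $\epsilon$. Since $\sigma = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, where $ad = 8$, we have

$$e^{2\pi i\sigma\tau} = \zeta_d^{\,b}\,q^{a/d} = \zeta_d^{\,b}\,(q^{1/8})^{a^2},$$

and consequently, the product expansions for the Weber functions imply
that $\mathfrak{f}_1(8\gamma\tau)^6$ is meromorphic in $q^{1/8}$. This proves that $\mathfrak{f}_1(8\tau)^6$ is a modular
function for $\Gamma_0(32)$. Q.E.D.

The next step in proving Theorem 12.24 is to determine some field
(not necessarily the smallest) containing $\mathfrak{f}_1(\sqrt{-m})^6$. The key point is that
$\mathfrak{f}_1(8\tau)^6$ is not only a modular function for $\Gamma_0(32)$, it’s also holomorphic
and its $q$-expansion is integral. Thus Proposition 12.7 tells us that $\mathfrak{f}_1(8\tau)^6 =
R(j(\tau), j(32\tau))$ for some rational function $R(X,Y) \in \mathbb{Q}(X,Y)$. We will write
this in the form

(12.27)    $$\mathfrak{f}_1(\tau)^6 = R(j(\tau/8), j(4\tau)).$$

Using Lemma 12.11 with $m = 32$ and $s = 8$, we see that

$$\frac{\partial\Phi_{32}}{\partial X}(j(4\sqrt{-m}), j(\sqrt{-m}/8)) \ne 0,$$

and thus, by Proposition 12.7, we conclude that

(12.28)    $$\mathfrak{f}_1(\sqrt{-m})^6 = R(j(\sqrt{-m}/8), j(4\sqrt{-m})) = R(j([8, \sqrt{-m}]), j([1, 4\sqrt{-m}])).$$

To identify what field this lies in, let $L'$ denote the ring class field of the
order $\mathcal{O}' = [1, 4\sqrt{-m}]$. Since $[8, \sqrt{-m}]$ is a proper fractional ideal for $\mathcal{O}'$
(this uses Lemma 7.5 and $m \equiv 6 \bmod 8$; see Exercise 12.22), it follows that
$\mathfrak{f}_1(\sqrt{-m})^6 \in L'$.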

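We want to prove that $\mathfrak{f}_1(\sqrt{-m})^6$ lies in the smaller field $L$. This is the
situation that occurred in the proof of Theorem 12.2, but here we will need
more than just a degree calculation. The crucial new idea will be to relate
Galois theory and modular functions.

Let’s first study the Galois theory of $L \subset L'$. The orders $\mathcal{O}'$ and $\mathcal{O}$ have
discriminants $-64m$ and $-4m$ respectively, so that Corollary 7.28 implies
that $h(-64m) = 4h(-4m)$. Thus $L'$ has degree 4 over $L$. Furthermore, the
isomorphisms $C(\mathcal{O}') \simeq \mathrm{Gal}(L'/K)$ and $C(\mathcal{O}) \simeq \mathrm{Gal}(L/K)$ imply that

$$\mathrm{Gal}(L'/L) \simeq \ker(C(\mathcal{O}') \to C(\mathcal{O})).$$

In Exercise 12.22 we show that $[4, 1 + \sqrt{-m}]$ is a proper $\mathcal{O}'$-ideal which
lies in the above kernel and has order 4. It follows that $L \subset L'$ is a cyclic
extension of degree 4.

The goal of the remainder of the proof will be to compute $\sigma(\mathfrak{f}_1(\sqrt{-m})^6)$
for some generator $\sigma$ of $\mathrm{Gal}(L'/L)$. At the end of §11 we described an
isomorphism

$$C(\mathcal{O}') \simeq \mathrm{Gal}(L'/K)$$

as follows. Given a class $[\mathfrak{a}] \in C(\mathcal{O}')$, let the corresponding automorphism
be $\sigma_{\mathfrak{a}} \in \mathrm{Gal}(L'/K)$. If we write $L' = K(j(\mathfrak{b}))$ for some proper fractional
$\mathcal{O}'$-ideal $\mathfrak{b}$, then Corollary 11.37 states that

$$\sigma_{\mathfrak{a}}(j(\mathfrak{b})) = j(\bar{\mathfrak{a}}\mathfrak{b}).$$

To exploit this, let $\mathfrak{b} = [8, \sqrt{-m}] = 8[1, \sqrt{-m}/8]$, so that (12.28) can be
written

$$\mathfrak{f}_1(\sqrt{-m})^6 = R(j(\mathfrak{b}), j(\mathcal{O}')).$$

Now let $\mathfrak{a} = [4, 1 + \sqrt{-m}]$, and let the corresponding automorphism be $\sigma =
\sigma_{\mathfrak{a}} \in \mathrm{Gal}(L'/L)$. Note that $\sigma$ is a generator of $\mathrm{Gal}(L'/L)$, and hence to
prove that $\mathfrak{f}_1(\sqrt{-m})^6$ lies in $L$, we need only prove that it is fixed by $\sigma$.
Using the above formula for $\mathfrak{f}_1(\sqrt{-m})^6$, we compute

$$\sigma(\mathfrak{f}_1(\sqrt{-m})^6) = R(\sigma(j(\mathfrak{b})), \sigma(j(\mathcal{O}'))) = R(j(\bar{\mathfrak{a}}\mathfrak{b}), j(\bar{\mathfrak{a}})).$$

Since $m \equiv 6 \bmod 8$, one easily sees that

$$\bar{\mathfrak{a}}\mathfrak{b} = [8, -2 + \sqrt{-m}], \qquad \bar{\mathfrak{a}} = [4, -1 + \sqrt{-m}]$$

(see Exercise 12.22), and hence $\sigma(\mathfrak{f}_1(\sqrt{-m})^6)$ can be written

(12.29)    $$\sigma(\mathfrak{f}_1(\sqrt{-m})^6) = R(j([8, -2 + \sqrt{-m}]), j([4, -1 + \sqrt{-m}])).$$

Now let $\gamma = \begin{pmatrix} 1 & -2 \\ 1 & -1 \end{pmatrix} \in \Gamma_0(2)^t$. If we substitute $\gamma\tau$ for $\tau$ in (12.27), we get

$$\mathfrak{f}_1(\gamma\tau)^6 = R(j(\gamma\tau/8), j(4\gamma\tau)).$$

Since $\gamma\tau = (\tau - 2)/(\tau - 1)$, one sees that

    $[1, \gamma\tau/8]$ is homothetic to $[8(\tau - 1), \tau - 2] = [8, -2 + \tau]$
    $[1, 4\gamma\tau]$ is homothetic to $[\tau - 1, 4(\tau - 2)] = [4, -1 + \tau]$,

and thus

$$\mathfrak{f}_1(\gamma\tau)^6 = R(j([8, -2 + \tau]), j([4, -1 + \tau])).$$

Evaluating this at $\tau = \sqrt{-m}$ and using (12.29), we see that

$$\sigma(\mathfrak{f}_1(\sqrt{-m})^6) = \mathfrak{f}_1(\gamma\sqrt{-m})^6.$$

However, (12.26) shows that $\mathfrak{f}_1(\gamma\tau)^6 = \mathfrak{f}_1(\tau)^6$ for all $\tau$, which proves that
$\mathfrak{f}_1(\sqrt{-m})^6$ is fixed by $\sigma$ and hence lies in the ring class field $L$. This
completes the proof of (i).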

The equation $\sigma(\mathfrak{f}_1(\sqrt{-m})^6) = \mathfrak{f}_1(\gamma\sqrt{-m})^6$ used above is significant, for
it allows us to compute the action of $\sigma \in \mathrm{Gal}(L'/K)$ using the matrix $\gamma \in
SL(2,\mathbb{Z})$. This correspondence between Galois automorphisms and linear
fractional transformations is not unexpected, for the $\mathfrak{f}_1(\gamma\tau)^6$’s are the
conjugates of $\mathfrak{f}_1(\tau)^6$ over $\mathbb{Q}(j(\tau))$, hence when we specialize to $\tau = \sqrt{-m}$, the
conjugates of $\mathfrak{f}_1(\sqrt{-m})^6$ should lie among the $\mathfrak{f}_1(\gamma\sqrt{-m})^6$’s. What’s surprising
is that there’s a systematic way of finding $\gamma$. This is the basic content
of the Shimura Reciprocity Law. A complete statement of the theorem
requires the adeles, so that we refer the reader to Lang [73, Chapter 11] or
Shimura [90, §6.8] for further details.

The proof of (ii) is similar to what we did for (i), though this case is a
little more difficult. We will sketch the main steps of the proof in Exercise
12.23. This completes the proof of Theorem 12.24. Q.E.D.


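We can now begin Weber’s computation of $j(\sqrt{-14})$ from [102, §144].
Let $K = \mathbb{Q}(\sqrt{-14})$. Since $\mathcal{O}_K = [1, \sqrt{-14}]$, $L = K(j(\sqrt{-14}))$ is the Hilbert
class field of $K$. As we saw in §5, $\mathrm{Gal}(L/K) \simeq C(\mathcal{O}_K)$ is cyclic of order
4. Furthermore, we can use the results of §6 to determine part of this
extension. Recall that the genus field $M$ of $K$ is the intermediate field $K \subset
M \subset L$ corresponding to the subgroup of squares. When $K = \mathbb{Q}(\sqrt{-14})$,
Theorem 6.1 tells us that $M = K(\sqrt{-7}) = K(\sqrt{2})$. Thus

$$K \subset K(\sqrt{2}) \subset L.$$

We will compute $\mathfrak{f}_1(\sqrt{-14})^2$, which lies in the Hilbert class field $L$ since
$m = 14$ satisfies the hypothesis of the first part of Theorem 12.24. Let $\sigma$ be
the unique element of $\mathrm{Gal}(L/K)$ of order 2, so that the fixed field of $\sigma$ is
the genus field $K(\sqrt{2})$. The key step in the computation is to show that

(12.30)    $$\sigma(\mathfrak{f}_1(\sqrt{-14})^2) = \mathfrak{f}_2(\sqrt{-14}/2)^2.$$

We start with the equation

$$\mathfrak{f}_1(\sqrt{-14})^6 = R(j(\mathfrak{b}), j(\mathcal{O}'))$$

from Theorem 12.24, where $\mathcal{O}' = [1, 4\sqrt{-14}]$ and $\mathfrak{b} = [8, \sqrt{-14}]$. If $\mathcal{O}'$ and
$L'$ are as in the proof of Theorem 12.24, then $\mathfrak{b}$ determines a class in $C(\mathcal{O}')$
and hence an automorphism $\sigma_{\mathfrak{b}} \in \mathrm{Gal}(L'/K)$. It is easy to check that $\mathfrak{b}$
maps to the unique element of order 2 in $C(\mathcal{O}_K)$ (see Exercise 12.24), and
consequently, the restriction of $\sigma_{\mathfrak{b}}$ to $L$ is the above automorphism $\sigma$. By
abuse of notation, we will write $\sigma = \sigma_{\mathfrak{b}}$. Then, using Corollary 11.37, we
obtain

$$\sigma(\mathfrak{f}_1(\sqrt{-14})^6) = R(j(\bar{\mathfrak{b}}\mathfrak{b}), j(\bar{\mathfrak{b}})) = R(j(\mathcal{O}'), j(\mathfrak{b}))$$

since $\bar{\mathfrak{b}} = \mathfrak{b}$ and $\bar{\mathfrak{b}}\mathfrak{b} = [2, 8\sqrt{-14}] = 2\mathcal{O}'$. Thus

(12.31)    $$\sigma(\mathfrak{f}_1(\sqrt{-14})^6) = R(j([1, 4\sqrt{-14}]), j([8, \sqrt{-14}])).$$

Let $\gamma = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, and note that $\mathfrak{f}_2(\tau) = \mathfrak{f}_1(\gamma\tau)$ by Corollary 12.19. Combining
this with (12.27), we get

$$\mathfrak{f}_2(\tau)^6 = \mathfrak{f}_1(\gamma\tau)^6 = R(j(\gamma\tau/8), j(4\gamma\tau)) = R(j([1, 8\tau]), j([4, \tau])).$$

Evaluating this at $\tau = \sqrt{-14}/2$ and using (12.31), we obtain

$$\sigma(\mathfrak{f}_1(\sqrt{-14})^6) = \mathfrak{f}_2(\sqrt{-14}/2)^6.$$

If we take the cube root of each side, we see that

$$\sigma(\mathfrak{f}_1(\sqrt{-14})^2) = \zeta_3^{\,i}\,\mathfrak{f}_2(\sqrt{-14}/2)^2$$

for some cube root of unity $\zeta_3^{\,i}$. It remains to prove that the cube root is 1.
From (12.16) we know that $\mathfrak{f}_1(\tau)\mathfrak{f}_2(\tau/2) = \sqrt{2}$, so that

(12.32)    $$\mathfrak{f}_1(\sqrt{-14})^2\,\sigma(\mathfrak{f}_1(\sqrt{-14})^2) = \zeta_3^{\,i}\,\mathfrak{f}_1(\sqrt{-14})^2\,\mathfrak{f}_2(\sqrt{-14}/2)^2 = 2\zeta_3^{\,i}.$$

Since $\mathfrak{f}_1(\sqrt{-14})^2\,\sigma(\mathfrak{f}_1(\sqrt{-14})^2)$ is fixed by $\sigma$, it lies in $K(\sqrt{2})$, and hence
$\zeta_3^{\,i} \in K(\sqrt{2}) = \mathbb{Q}(\sqrt{2}, \sqrt{-7})$. This forces the root of unity to be 1, and
(12.30) is proved.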

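Now let $\alpha = \mathfrak{f}_1(\sqrt{-14})^2$. From (12.32) we see that $\alpha\sigma(\alpha) = 2$, so that
$\alpha + \sigma(\alpha) = \alpha + 2/\alpha$ lies in $K(\sqrt{2})$. But $\alpha$ is clearly real, so that $\alpha + 2/\alpha \in
\mathbb{Q}(\sqrt{2})$, and furthermore, $\alpha$ and $2/\alpha = \sigma(\alpha)$ are algebraic integers by
Theorem 12.24. It follows that

(12.33)    $$\alpha + \frac{2}{\alpha} = a + b\sqrt{2}, \qquad a, b \in \mathbb{Z}.$$

We will use a wonderful argument of Weber to show that $a$ and $b$ are both
positive. Namely, (12.33) gives a quadratic equation for $\alpha$, and since $\alpha$ is
real and positive (see the product formula for $\mathfrak{f}_1(\tau)$), the discriminant must
be nonnegative, i.e.,

$$(a + b\sqrt{2})^2 \ge 8.$$

Let $\sigma_1$ be a generator of $\mathrm{Gal}(L/K)$ (so $\sigma = \sigma_1^2$). Then $\sigma_1(\sqrt{2}) = -\sqrt{2}$, and
hence

$$\sigma_1(\alpha) + \frac{2}{\sigma_1(\alpha)} = a - b\sqrt{2}.$$

But $\sigma_1(\alpha)$ cannot be real, for then $L \cap \mathbb{R} = \mathbb{Q}(\alpha)$ would be Galois over $\mathbb{Q}$,
which contradicts $\mathrm{Gal}(L/\mathbb{Q}) \simeq D_8$ (see Lemma 9.3). Thus the discriminant
of the resulting quadratic equation must be negative, i.e.,

$$(a - b\sqrt{2})^2 < 8.$$

Subtracting these two inequalities gives

$$4ab\sqrt{2} > 0,$$

so that $a$ and $b$ have the same sign, and both are positive since $a + b\sqrt{2} =
\alpha + 2/\alpha > 0$.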

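As $a$ and $b$ range over all positive integers, the resulting numbers $a +
b\sqrt{2}$ form a discrete subset of $\mathbb{R}$ (by contrast, $\mathbb{Z}[\sqrt{2}]$ is dense in $\mathbb{R}$). Thus we
can compute $a$ and $b$ by approximating $\alpha + 2/\alpha$ sufficiently closely. Setting
$q = e^{-\pi\sqrt{14}}$, (12.15) implies

$$\frac{2}{\alpha} = \mathfrak{f}_2(\sqrt{-14}/2)^2 = 2q^{1/12}\prod_{n=1}^{\infty}(1 + q^n)^2.$$

Applying the methods used in our class number 1 calculations, we see that

$$2q^{1/12} < \frac{2}{\alpha} < 2q^{1/12}e^{2.002q},$$

and thus

$$q^{-1/12}e^{-2.002q} < \alpha < q^{-1/12}.$$

These inequalities imply that

$$\alpha + \frac{2}{\alpha} \approx q^{-1/12} + 2q^{1/12} \approx 2.6633 + .7509 = 3.4142,$$

with an error of at most $10^{-4}$. Compare this to the smallest values of $a +
b\sqrt{2}$, $a, b > 0$:

$$1 + \sqrt{2} \approx 2.4142 < 2 + \sqrt{2} \approx 3.4142 < 1 + 2\sqrt{2} \approx 3.8284.$$

It follows that $\alpha + 2/\alpha = 2 + \sqrt{2}$, and then the quadratic formula implies

$$\alpha = \frac{2 + \sqrt{2} \pm \sqrt{4\sqrt{2} - 2}}{2} = \frac{\sqrt{2} + 1 \pm \sqrt{2\sqrt{2} - 1}}{\sqrt{2}}.$$

Since $\alpha \approx 2.6633$ is the larger root, we have

$$\alpha = \mathfrak{f}_1(\sqrt{-14})^2 = \frac{\sqrt{2} + 1 + \sqrt{2\sqrt{2} - 1}}{\sqrt{2}},$$

and we can now compute $\gamma_2(\sqrt{-14})$:

$$\gamma_2(\sqrt{-14}) = \frac{\mathfrak{f}_1(\sqrt{-14})^{24} + 16}{\mathfrak{f}_1(\sqrt{-14})^{8}} = \alpha^8 + \frac{16}{\alpha^4} = \alpha^8 + \Bigl(\frac{2}{\alpha}\Bigr)^4$$
$$= \left(\frac{\sqrt{2} + 1 + \sqrt{2\sqrt{2} - 1}}{\sqrt{2}}\right)^{8} + \left(\frac{\sqrt{2} + 1 - \sqrt{2\sqrt{2} - 1}}{\sqrt{2}}\right)^{4}$$
$$= 2\Bigl(323 + 228\sqrt{2} + (231 + 161\sqrt{2})\sqrt{2\sqrt{2} - 1}\Bigr),$$

where the last step was done using REDUCE. Cubing this, we get the formula
for $j(\sqrt{-14})$ given in (12.1).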
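
The numbers in this computation are easy to reproduce. The following Python sketch evaluates $\alpha = \mathfrak{f}_1(\sqrt{-14})^2$ from the product expansion of $\mathfrak{f}_2(\sqrt{-14}/2)$, compares it with the closed form, and checks the final simplification numerically. The function names are ours, and floating-point agreement is offered only as a sanity check on the algebra.

```python
import math

SQRT2 = math.sqrt(2)
ROOT = math.sqrt(2 * SQRT2 - 1)

def alpha_from_product(terms=50):
    """alpha = 2 / f2(sqrt(-14)/2)^2, using (12.15) with q = exp(-pi sqrt(14))."""
    q = math.exp(-math.pi * math.sqrt(14))
    prod, qn = 1.0, 1.0
    for _ in range(terms):
        qn *= q
        prod *= 1 + qn
    return 2 / (2 * q ** (1 / 12) * prod ** 2)

def alpha_closed_form():
    """The larger root of x^2 - (2 + sqrt 2) x + 2 = 0."""
    return (SQRT2 + 1 + ROOT) / SQRT2

def gamma2_closed_form():
    """The simplified expression for gamma_2(sqrt(-14))."""
    return 2 * (323 + 228 * SQRT2 + (231 + 161 * SQRT2) * ROOT)

a = alpha_closed_form()
print(alpha_from_product(), a)
print(a + 2 / a, 2 + SQRT2)
print(a ** 8 + (2 / a) ** 4, gamma2_closed_form())
```

Each printed pair should agree to within rounding error; the common value of $\gamma_2(\sqrt{-14})$ is about $2531.3$.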

One corollary is that $L = K(\sqrt{2\sqrt{2} - 1})$ is the Hilbert class field of $K =
\mathbb{Q}(\sqrt{-14})$. This method of determining $L$ is more difficult than what we did
in §5, but it’s worth the effort, since the formulas are simply wonderful! These
same techniques can be used to determine $j(\sqrt{-46})$ and $j(\sqrt{-142})$ (see
Exercise 12.25), and in [102, §144] Weber does 7 other cases by similar
methods.

The examples done so far represent only a small fraction of the singular
$j$-invariants computed by Weber in [102]. He uses a wide variety of methods
and devotes many sections to computations; the interested reader should
consult §§125, 128, 129, 130, 131, 135, 139, 143, and 144 for more examples.
We should also mention that in 1927, Berwick [4] published the $j$-invariants
(in factored form) of all known orders of class number $\le 3$. For a modern
discussion of how to compute singular moduli, see Herz [55].


E. Imaginary Quadratic Fields of Class Number 1 


We will end this section with another application of the Weber functions: 
the determination of all imaginary quadratic fields of class number 1. 


Theorem 12.34. Let K be an imaginary quadratic field of discriminant $d_K$.
Then

\[ h(d_K) = 1 \iff d_K = -3, -4, -7, -8, -11, -19, -43, -67, -163. \]


Remark. As we saw in Theorem 7.30, this theorem enables us to determine 
all discriminants D with h(D) = 1. 


Proof. This theorem was first proved by Heegner [52] in 1952, but his proof 
was not accepted at first, partly because of his heavy reliance on Weber. 
In 1966 complete proofs were found independently by Baker [3] and Stark 
[96], which led people to look back at Heegner’s work and realize that he 
did have a complete proof after all (see Birch [6] and Stark [98]). We will 
follow Stark’s presentation [98] of Heegner’s argument. 

The first part of the proof is quite elementary. Let $d_K$ be a discriminant
such that $h(d_K) = 1$. Recall from Theorem 2.18 that $h(-4n) = 1$ if and only
if $-4n = -4, -8, -12, -16$ or $-28$. Thus, if $d_K \equiv 0 \bmod 4$, then $d_K = -4$
or $-8$ since $d_K$ is a field discriminant. So we may assume $d_K \equiv 1 \bmod 4$,
and then Theorem 3.15 implies that there are $2^{\mu-1}$ genera of forms of
discriminant $d_K$, where $\mu$ is the number of primes dividing $d_K$. Since $h(d_K) =
1$, it follows that $\mu = 1$, so that $d_K = -p$, where $p \equiv 3 \bmod 4$ is prime.

If $p \equiv 7 \bmod 8$, then Theorem 7.24 implies that

\[ h(-4p) = 2h(-p)\left(1 - \left(\frac{-p}{2}\right)\frac{1}{2}\right) = h(-p) = 1, \]

and using Theorem 2.18 again, we see that p = 7.

We are thus reduced to the case $p \equiv 3 \bmod 8$, and of course we may
assume that $p \ne 3$. Then Theorem 7.24 tells us that

\[ h(-4p) = 2h(-p)\left(1 - \left(\frac{-p}{2}\right)\frac{1}{2}\right) = 3h(-p) = 3. \]

This implies that $\mathbb{Q}(j(\sqrt{-p}))$ has degree 3 over $\mathbb{Q}$. By the second part of
Theorem 12.24, we know that $f(\sqrt{-p})^2 \in K(j(\sqrt{-p}))$, and since $f(\sqrt{-p})^2$
is real, we see that $f(\sqrt{-p})^2$ generates a cubic extension of $\mathbb{Q}$.
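
The two class number relations used above can be spot-checked by counting reduced forms. The sketch below is an illustration under stated assumptions: `class_number` and `is_prime` are hypothetical helper names, and the class number is computed as the number of primitive reduced forms $ax^2 + bxy + cy^2$ of discriminant D, which is the description of h(D) used in §2.

```python
from math import gcd, isqrt

def class_number(D):
    """Count primitive reduced forms a x^2 + b xy + c y^2 of discriminant D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    count = 0
    for a in range(1, isqrt(-D // 3) + 1):      # reduced forms have a^2 <= |D|/3
        for b in range(-a + 1, a + 1):          # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):     # need a <= c, and b >= 0 when a == c
                continue
            if gcd(gcd(a, b), c) == 1:
                count += 1
    return count

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

# h(-4p) = h(-p) when p = 7 mod 8, and h(-4p) = 3 h(-p) when p = 3 mod 8, p > 3
for p in range(5, 100):
    if is_prime(p) and p % 4 == 3:
        factor = 1 if p % 8 == 7 else 3
        print(p, class_number(-p), class_number(-4 * p), factor)
```

For example, p = 11 gives h(-11) = 1 and h(-44) = 3, while p = 23 gives h(-23) = h(-92) = 3.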

Let $\tau_0 = (3 + \sqrt{-p})/2$, and set $\alpha = \zeta_8^{-1} f_2(\tau_0)^2$. We can relate this to
$f(\sqrt{-p})^2$ as follows. We know from (12.16) that

\[ f_1(2\tau_0) f_2(\tau_0) = \sqrt{2}, \]

and Corollary 12.19 tells us that

\[ f_1(2\tau_0) = f_1(3 + \sqrt{-p}) = \zeta_{48}^{-3} f(\sqrt{-p}) = \zeta_{16}^{-1} f(\sqrt{-p}). \]

These formulas imply that $\alpha = 2/f(\sqrt{-p})^2$, and hence $\alpha$ generates the
cubic extension $\mathbb{Q}(f(\sqrt{-p})^2)$. Note also that $\alpha^4$ generates the same cubic
extension.

Let's study the minimal polynomial of $\alpha^4$. Since $\mathcal{O}_K = [1, \tau_0]$ and $h(-p) =
1$, we know that $j(\tau_0)$ is an integer, and then $\gamma_2(\tau_0)$ is also an integer by
Theorem 12.2. Since

\[ \gamma_2(\tau_0) = \frac{f_2(\tau_0)^{24} + 16}{f_2(\tau_0)^{8}}, \]

it follows that $\alpha^4 = -f_2(\tau_0)^8$ is a root of the cubic equation

\[ x^3 - \gamma_2(\tau_0)x - 16 = 0. \tag{12.35} \]

This is the minimal polynomial of $\alpha^4$ over $\mathbb{Q}$.
However, $\alpha$ is also cubic over $\mathbb{Q}$, and thus satisfies an equation of the
form

\[ x^3 + ax^2 + bx + c = 0, \]

where a, b and c lie in $\mathbb{Z}$ since $\alpha$ is an algebraic integer. Heegner's insight
was that this equation put some very strong constraints on the equation
satisfied by $\alpha^4$. In fact, moving the even degree terms to the right and
squaring, we get

\[ (x^3 + bx)^2 = (-ax^2 - c)^2, \]

so that $\alpha$ satisfies

\[ x^6 + (2b - a^2)x^4 + (b^2 - 2ac)x^2 - c^2 = 0. \]

Hence $\alpha^2$ satisfies the cubic equation

\[ x^3 + ex^2 + fx + g = 0, \quad e = 2b - a^2, \quad f = b^2 - 2ac, \quad g = -c^2, \]

and repeating this process, we see that $\alpha^4$ satisfies the cubic equation

\[ x^3 + (2f - e^2)x^2 + (f^2 - 2eg)x - g^2 = 0. \]

By the uniqueness of the minimal polynomial, this equation must equal
(12.35). Comparing coefficients, we obtain

\begin{align*}
2f - e^2 &= 0 \\
f^2 - 2eg &= -\gamma_2(\tau_0) \tag{12.36} \\
g^2 &= 16.
\end{align*}

The third equation of (12.36) implies $g = \pm 4$, and since $g = -c^2$, we have
$g = -4$ and $c = \pm 2$. However, changing $\alpha$ to $-\alpha$ leaves $\alpha^4$ fixed but takes
a, b, c to $-a, b, -c$. Thus we may assume c = 2, and it follows that

\[ \gamma_2(\tau_0) = -f^2 - 8e = -(b^2 - 4a)^2 - 8(2b - a^2). \]
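
The squaring step is a general fact about monic cubics, so it can be tested on any example. The following Python sketch is an illustration only; the function name `square_roots_cubic` is an assumption made here. It applies the step to a cubic with roots 1, 2, 3, and then to the coefficients a = -4, b = 28, c = 2, which come from one of the solutions found below.

```python
def square_roots_cubic(a, b, c):
    """Given x^3 + a x^2 + b x + c with roots r1, r2, r3, return (e, f, g)
    so that x^3 + e x^2 + f x + g has roots r1^2, r2^2, r3^2."""
    return 2 * b - a * a, b * b - 2 * a * c, -c * c

# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
once = square_roots_cubic(-6, 11, -6)     # roots 1, 4, 9
twice = square_roots_cubic(*once)         # roots 1, 16, 81
print(once, twice)

# a = -4, b = 28, c = 2: two squaring steps give the cubic for alpha^4
e, f, g = square_roots_cubic(-4, 28, 2)
heegner = square_roots_cubic(e, f, g)
print(heegner)   # coefficients of x^2, x, 1 in the cubic satisfied by alpha^4
```

In the second example the result has the shape $x^3 - \gamma_2 x - 16$ required by (12.35), with $\gamma_2 = -640320$.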


It remains to determine the possible a's and b's.
The first equation $2f = e^2$ of (12.36) may be written

\[ 2(b^2 - 4a) = (2b - a^2)^2, \]

which implies that a and b are even. If we set $X = -a/2$ and $Y = (b -
a^2)/2$, then a little algebra shows that X and Y are integer solutions of the
Diophantine equation

\[ 2X(X^3 + 1) = Y^2 \]


(see Exercise 12.26). This equation has the following integer solutions: 


Proposition 12.37. The only integer solutions of the Diophantine equation
$2X(X^3 + 1) = Y^2$ are $(X,Y) = (0,0)$, $(-1,0)$, $(1,\pm 2)$, and $(2,\pm 6)$.


Proof. Let (X,Y) be an integer solution. Since X and $X^3 + 1$ are relatively
prime, the equation $2X(X^3 + 1) = Y^2$ implies that $\pm(X^3 + 1)$ is a square
or twice a square. Thus (X,Y) gives an integer solution of one of four
Diophantine equations. These equations, together with some of their obvious
solutions, may be written as follows:

(i) $X^3 + 1 = Z^2$, $(X,Z) = (-1,0), (0,\pm 1), (2,\pm 3)$.
(ii) $X^3 + 1 = -Z^2$, $(X,Z) = (-1,0)$.
(iii) $W^6 + 1 = 2Z^2$, $(W,Z) = (\pm 1,\pm 1)$.
(iv) $X^3 + 1 = -2Z^2$, $(X,Z) = (-1,0)$.

To explain (iii), note that if $X^3 + 1 = 2Z^2$, then $2X(X^3 + 1) = Y^2$ implies
that $X = W^2$ for some W, which by substitution gives us $W^6 + 1 = 2Z^2$. In
Exercises 12.27-12.29, we will show that the solutions listed above are all
integer solutions of these four equations. Once this is done, the proposition
follows easily.

The integer solutions of equations (ii)-(iv) are relatively easy to find.
We need nothing more than the techniques used when we considered the
equation $Y^2 = X^3 - 2$ in Exercises 5.21 and 5.22. See Exercise 12.27 for the
details of these three cases.

The integer solutions of equation (i) are more difficult to find, and the
elementary methods used in (ii)-(iv) don't suffice. Fortunately, we can turn
to Euler for help, for in 1738 he used Fermat's technique of infinite descent
to determine all integer (and rational) solutions of (i) (see [33, Vol. II, pp.
56-58]). A version of Euler's argument may be found in Exercises 12.28
and 12.29. This completes the proof of the proposition. Q.E.D.
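
A brute-force search is of course no substitute for the proof, but it is a quick sanity check on the list of solutions. The Python sketch below is an illustration only; the function name `solutions` and the search bound are arbitrary choices made here.

```python
from math import isqrt

def solutions(bound):
    """All integer solutions (X, Y) of 2X(X^3 + 1) = Y^2 with |X| <= bound."""
    found = []
    for X in range(-bound, bound + 1):
        n = 2 * X * (X**3 + 1)
        if n < 0:
            continue
        Y = isqrt(n)
        if Y * Y == n:
            found.append((X, Y))
            if Y:
                found.append((X, -Y))
    return sorted(found)

print(solutions(2000))
```

Within the searched range this returns exactly the six pairs named in Proposition 12.37.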


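Once we know the solutions of $2X(X^3 + 1) = Y^2$, we can compute a, b
and hence $\gamma_2(\tau_0)$. This gives us the following table, where $a = -2X$,
$b = 4X^2 + 2Y$ and $\gamma_2(\tau_0) = -(b^2 - 4a)^2 - 8(2b - a^2)$:

     X      Y      a      b    $\gamma_2(\tau_0)$    $d_K$
     0      0      0      0           0     -3
    -1      0      2      4         -96    -19
     1      2     -2      8       -5280    -67
     2      6     -4     28     -640320   -163
     1     -2     -2      0         -32    -11
     2     -6     -4      4        -960    -43

Note that these $\gamma_2(\tau_0)$'s are among those computed earlier in table (12.20).
Since $j(\mathcal{O}_K)$ determines K uniquely (see Exercise 12.30), it follows that we
now know all imaginary quadratic fields of class number 1. This proves the
theorem. Q.E.D.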
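
The passage from the solutions (X,Y) to the values of $\gamma_2(\tau_0)$ is pure arithmetic, and the following Python sketch (an illustration; the function name `gamma2_row` is an assumption made here) recomputes it from the formulas $a = -2X$, $b = 4X^2 + 2Y$ and $\gamma_2(\tau_0) = -(b^2 - 4a)^2 - 8(2b - a^2)$, checking along the way that each (a, b) satisfies the first equation of (12.36).

```python
# the six solutions of 2X(X^3 + 1) = Y^2 from Proposition 12.37
SOLUTIONS = [(0, 0), (-1, 0), (1, 2), (2, 6), (1, -2), (2, -6)]

def gamma2_row(X, Y):
    a = -2 * X
    b = 4 * X * X + 2 * Y
    # first equation of (12.36), rewritten in terms of a and b
    assert 2 * (b * b - 4 * a) == (2 * b - a * a) ** 2
    gamma2 = -(b * b - 4 * a) ** 2 - 8 * (2 * b - a * a)
    return X, Y, a, b, gamma2

rows = [gamma2_row(X, Y) for X, Y in SOLUTIONS]
for row in rows:
    print(row, "j =", row[4] ** 3)
```

Cubing the last entry of each row gives the corresponding j-invariant, for example $(-640320)^3$ in the fourth row.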


Note that Heegner's argument is clever but elementary: the hard part
is proving that $f(\sqrt{-p})^2$ lies in the appropriate ring class field. Thus Weber
could have solved the class number 1 problem in 1908! We should also
mention that there is a more elementary version of the above argument
which makes no use of the Weber functions (see Stark [98]).


F. Exercises 


12.1. Show that $g_2(\tau)$, $g_3(\tau)$ and $\Delta(\tau)$ are real-valued when $\tau$ is purely
imaginary. Hint: use Exercise 11.1.


12.2. Let $F(q) = 1 + \sum_{n=1}^{\infty} a_n q^n$ be a power series which converges in a
neighborhood of the origin.
(a) Show that for any positive integer m, there is a unique power
series G(q), converging in a possibly smaller neighborhood of 0,
such that $F(q) = G(q)^m$ and G(0) = 1.


(b) If in addition the coefficients of F(q) are rational numbers,
show that the power series G(q) from part (a) also has rational
coefficients.


12.3. In this exercise we will prove that $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ generate
SL(2,Z). To start, let $\Gamma$ be the subgroup of SL(2,Z) generated by S
and T.

(a) Show that every element of SL(2,Z) of the form $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ or $\begin{pmatrix} 0 & b \\ c & d \end{pmatrix}$
lies in $\Gamma$.

(b) Fix $\gamma_0 \in$ SL(2,Z), and choose $\gamma \in \Gamma$ so that $\gamma\gamma_0 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has the
minimal |c|.
(i) If a = 0 or c = 0, then use (a) to show that $\gamma_0 \in \Gamma$.
(ii) If $c \ne 0$, then, of the $\gamma$'s that give the minimal |c|, choose
one that has the minimal |a|. Use

\[ T^{\pm 1} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a \pm c & * \\ c & * \end{pmatrix} \]

to show that |a| < |c|, and then use

\[ S \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} -c & * \\ a & * \end{pmatrix} \]

to show that a = 0. Conclude that $\gamma_0 \in \Gamma$.
(c) Use (a) and (b) to show that S and T generate SL(2,Z).
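12.4. In this exercise we will give generators for the following subgroups
of SL(2,Z):

\begin{align*}
\Gamma_0(2) &= \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : c \equiv 0 \bmod 2 \right\} \\
\Gamma_0(2)^t &= \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : b \equiv 0 \bmod 2 \right\} \\
\tilde\Gamma(2) &= \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : b \equiv c \equiv 0 \bmod 2 \right\}.
\end{align*}

Let $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.

(a) Modify the argument of Exercise 12.3 to show that $-I$, $A^2$ and
B generate $\Gamma_0(2)$. Hint: let $\Gamma$ be generated by $-I$, $A^2$ and B.
Given $\gamma_0 \in \Gamma_0(2)$, choose $\gamma \in \Gamma$ so that $\gamma\gamma_0 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is minimal in
the sense of Exercise 12.3. If $c \ne 0$, show that |a| < |c|, and then
use

\[ A^{\pm 2} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & * \\ c \pm 2a & * \end{pmatrix} \]

to prove that a = 0, which is impossible in this case.

(b) Show that $-I$, A and $B^2$ generate $\Gamma_0(2)^t$. In the text, these
generators are denoted $-I$, U and V respectively.

(c) Adapt the argument of (a) to show that $-I$, $A^2$ and $B^2$ generate
$\tilde\Gamma(2)$.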
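12.5. This exercise is concerned with the properties of $\gamma_2(\tau)$.

(a) Prove (12.6) by induction on the length of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ as a word in the
matrices S and T of Exercise 12.3.

(b) Use (12.6) to show that $\gamma_2(\tau)$ is invariant under the group

\[ \tilde\Gamma(3) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : b \equiv c \equiv 0 \bmod 3 \right\}. \]

(c) Show that

\[ \Gamma_0(9) = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}^{-1} \tilde\Gamma(3) \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \]

and conclude that $\gamma_2(3\tau)$ is invariant under $\Gamma_0(9)$.

(d) Use (12.6) to show that the exact subgroup of SL(2,Z) under
which $\gamma_2(\tau)$ is invariant is

\[ \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : a \equiv d \equiv 0 \bmod 3 \text{ or } b \equiv c \equiv 0 \bmod 3 \right\}. \]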


12.6. Complete the proof of part (ii) of Proposition 12.7 using the hints
given in the text.


12.7. Let $\mathcal{O} = [1, \tau_0]$ be an order of discriminant D in an imaginary
quadratic field, and assume that $\tau_0 = \sqrt{-m}$ or $(3 + \sqrt{-m})/2$, depending
on whether $D \equiv 0$ or $1 \bmod 4$. Let $\mathcal{O}' = [1, 3\tau_0]$ be the order of index
3 in $\mathcal{O}$. If $3 \nmid D$, then prove that $[1, \tau_0/3]$ is a proper fractional
$\mathcal{O}'$-ideal. Hint: use Lemma 7.5.


12.8. Adapt the argument of Lemma 12.11 to show that

\[ \frac{\partial \Phi_9}{\partial X}\bigl(j(3\tau_0), j(\tau_0/3)\bigr) \ne 0. \]

Hint: it suffices to show that (12.12) cannot hold.


12.9. This exercise is concerned with the elementary properties of the
Weber functions.

(a) Prove the product expansions (12.15).

(b) Prove the top line of (12.16). Hint: use the product expansions to
show that

\[ \eta(\tau) f(\tau) f_1(\tau) f_2(\tau) = \sqrt{2}\, \eta(\tau). \]

(c) Prove the bottom line of (12.16). Hint: use the definitions.


12.10. Exercises 12.10, 12.11 and 12.13 are concerned with the Weierstrass
$\sigma$-function. The basic properties of $\sigma(z;\tau)$ will be covered, though
we will neglect the details of convergence. For careful treatment
of this material, see Chandrasekharan [16, Chapter IV], Lang [73,
Chapter 18], and Whittaker and Watson [109, Chapter XX]. As in
the text, the $\sigma$-function is defined by

\[ \sigma(z;\tau) = z \prod_{\omega \in L - \{0\}} \left(1 - \frac{z}{\omega}\right) e^{z/\omega + (1/2)(z/\omega)^2}, \]

where $L = [1, \tau]$. Note that $\sigma(z;\tau)$ is an odd function in z. We will
write $\sigma(z)$ instead of $\sigma(z;\tau)$.
(a) Define the Weierstrass $\zeta$-function $\zeta(z)$ (not to be confused with
Riemann's) by

\[ \zeta(z) = \frac{\sigma'(z)}{\sigma(z)}. \]

Using the definition of $\sigma(z)$, show that

\[ \zeta(z) = \frac{1}{z} + \sum_{\omega \in L - \{0\}} \left( \frac{1}{z - \omega} + \frac{1}{\omega} + \frac{z}{\omega^2} \right). \]

(b) Show that the $\zeta$-function is related to the $\wp$-function by the
formula

\[ \wp(z) = -\zeta'(z). \]

(c) By (b), it follows that if $\omega \in L$, then $\zeta(z + \omega) - \zeta(z)$ is a
constant depending only on $\omega$. Since $L = [1, \tau]$, we define $\eta_1$ and
$\eta_2$ by the formulas

\begin{align*}
\eta_1 &= \zeta(z + \tau) - \zeta(z) \\
\eta_2 &= \zeta(z + 1) - \zeta(z).
\end{align*}

Then prove Legendre's relation

\[ \eta_2 \tau - \eta_1 = 2\pi i. \]

Hint: consider $\int_\Gamma \zeta(z)\,dz$, where $\Gamma$ is the boundary, oriented
counterclockwise, of the parallelogram P used in the proof of
Lemma 10.4. By standard residue theory, the integral equals
$2\pi i$ by (a). But the defining relations for $\eta_1$ and $\eta_2$ allow one
to compute the integral directly.

(d) We can now show that

\begin{align*}
\sigma(z + \tau) &= -e^{\eta_1(z + \tau/2)} \sigma(z) \\
\sigma(z + 1) &= -e^{\eta_2(z + 1/2)} \sigma(z).
\end{align*}

(i) Show that

\[ \frac{d}{dz}\, \frac{\sigma(z + \tau)}{\sigma(z)} = \eta_1\, \frac{\sigma(z + \tau)}{\sigma(z)}, \]

and conclude that for some constant C,

\[ \sigma(z + \tau) = C e^{\eta_1 z} \sigma(z). \]

(ii) Determine the constant C in (i) by evaluating the above
identity at $z = -\tau/2$. This will prove the desired formula
for $\sigma(z + \tau)$. Hint: recall that $\sigma(z)$ is an odd function.

(iii) In a similar way, prove the formula for $\sigma(z + 1)$.
12.11. The goal of this exercise is to prove the formula

\[ \wp(z) - \wp(w) = -\frac{\sigma(z + w)\sigma(z - w)}{\sigma^2(z)\sigma^2(w)}. \]

Fix $w \notin L = [1, \tau]$, and consider the function

\[ f(z) = -\frac{\sigma(z + w)\sigma(z - w)}{\sigma^2(z)\sigma^2(w)}. \]

(a) Show that f(z) is an even elliptic function for L. By Lemma
10.17, this implies that f(z) is a rational function in $\wp(z)$.

(b) Show that f(z) is holomorphic on $\mathbb{C} - L$ and that its Laurent
expansion at z = 0 begins with $1/z^2$.

(c) Conclude from (b) that $f(z) = \wp(z) + C$ for some constant C,
and evaluate the constant by setting z = w. This proves the
desired formula.


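12.12. Use the previous exercise to show that

\begin{align*}
e_2 - e_1 &= -e^{-\eta_2 \tau/2}\, \frac{\sigma^2\bigl(\frac{\tau + 1}{2}\bigr)}{\sigma^2\bigl(\frac{1}{2}\bigr)\sigma^2\bigl(\frac{\tau}{2}\bigr)} \\
e_2 - e_3 &= -e^{\eta_2(\tau + 1)/2}\, \frac{\sigma^2\bigl(\frac{\tau}{2}\bigr)}{\sigma^2\bigl(\frac{1}{2}\bigr)\sigma^2\bigl(\frac{\tau + 1}{2}\bigr)} \\
e_3 - e_1 &= e^{\eta_1(\tau + 1)/2}\, \frac{\sigma^2\bigl(\frac{1}{2}\bigr)}{\sigma^2\bigl(\frac{\tau}{2}\bigr)\sigma^2\bigl(\frac{\tau + 1}{2}\bigr)}.
\end{align*}

Hint: for $e_2 - e_1$, use the fact that

\begin{align*}
\sigma\Bigl(\frac{1 - \tau}{2}\Bigr) = \sigma\Bigl(-\frac{\tau + 1}{2} + 1\Bigr) &= -e^{\eta_2(-(\tau + 1)/2 + 1/2)}\, \sigma\Bigl(-\frac{\tau + 1}{2}\Bigr) \\
&= e^{-\eta_2 \tau/2}\, \sigma\Bigl(\frac{\tau + 1}{2}\Bigr).
\end{align*}


12.13. The final fact we need to know about the $\sigma$-function is its q-product
expansion

\[ \sigma(z;\tau) = \frac{1}{2\pi i}\, e^{\eta_2 z^2/2} \bigl(q_z^{1/2} - q_z^{-1/2}\bigr) \prod_{n=1}^{\infty} \frac{(1 - q_\tau^n q_z)(1 - q_\tau^n/q_z)}{(1 - q_\tau^n)^2}, \]

where $q_\tau = e^{2\pi i \tau}$ and $q_z = e^{2\pi i z}$. To prove this, let f(z) denote the
right-hand side of the above equation.

(a) Show that the zeros of f(z) and $\sigma(z)$ are exactly the points of
L. Thus $\sigma(z)/f(z)$ is holomorphic on $\mathbb{C} - L$.

(b) Show that $\sigma(z)/f(z)$ has periods $L = [1, \tau]$.

(c) Show that $\sigma(z)/f(z)$ is holomorphic at z = 0 and takes the value
1 there.

(d) Conclude that $\sigma(z) = f(z)$. Hint: use Exercise 10.5.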


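12.14. This exercise will complete the proof of the formulas (12.18)
expressing $e_i - e_j$ in terms of $\eta(\tau)$ and the Weber functions.

(a) Use the product expansion from Exercise 12.13 to show that

\begin{align*}
\sigma\Bigl(\frac{1}{2}\Bigr) &= \frac{1}{2\pi}\, e^{\eta_2/8}\, \frac{f_2(\tau)^2}{\eta(\tau)^2} \\
\sigma\Bigl(\frac{\tau}{2}\Bigr) &= \frac{i}{2\pi}\, e^{\eta_2 \tau^2/8}\, q^{-1/8}\, \frac{f_1(\tau)^2}{\eta(\tau)^2} \\
\sigma\Bigl(\frac{\tau + 1}{2}\Bigr) &= \frac{1}{2\pi}\, e^{\eta_2(\tau + 1)^2/8}\, q^{-1/8}\, \frac{f(\tau)^2}{\eta(\tau)^2},
\end{align*}

where $q = e^{2\pi i \tau}$.

(b) Use (a) and the formulas from Exercise 12.12 to prove

\begin{align*}
e_2 - e_1 &= \pi^2 \eta(\tau)^4 f(\tau)^8 \\
e_2 - e_3 &= \pi^2 \eta(\tau)^4 f_1(\tau)^8 \\
e_3 - e_1 &= \pi^2 \eta(\tau)^4 f_2(\tau)^8.
\end{align*}

This proves (12.18). Hint: use (12.16).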


12.15. In this exercise we will complete the proof of Theorem 12.17. Recall
from Exercise 10.8 that $g_2(\tau) = -4(e_1 e_2 + e_1 e_3 + e_2 e_3)$ and $e_1
+ e_2 + e_3 = 0$.
(a) Show that

\[ 3g_2(\tau) = 4\bigl((e_2 - e_1)^2 - (e_2 - e_3)(e_3 - e_1)\bigr). \]

(b) The identity of part (a), together with the formulas for $e_i - e_j$,
were used in the text to derive a formula for $\gamma_2(\tau)$ in terms
of $f(\tau)$. Find two other identities for $3g_2(\tau)$ similar to the one
given in part (a), and use them to derive formulas for $\gamma_2(\tau)$ in
terms of $f_1(\tau)$ and $f_2(\tau)$.


12.16. Use the formulas for $\gamma_2(\tau)$ given in Theorem 12.17 to show that the
q-expansion of the j-function has integral coefficients. This proves
Theorem 11.8.


12.17. Complete the proof of Corollary 12.19.

12.18. Verify the calculations made in table (12.20).


12.19. Use Theorem 6.1 to determine the Hilbert class field of $K =
\mathbb{Q}(\sqrt{-105})$, and show that its maximal real subfield is $\mathbb{Q}(\sqrt{3}, \sqrt{5},
\sqrt{7})$. Hint: use Theorem 3.22 to show that the genus field equals
the Hilbert class field in this case.


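12.20. This exercise is concerned with the properties of the Weber function
$f_1(\tau)$. Let $-I = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$, $U = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ and $V = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$.

(a) Use Corollary 12.19 to show $f_1(U\tau)^6 = f_1(V\tau)^6 = -i f_1(\tau)^6$.

(b) In Exercise 12.4 we proved that $-I$, U and V generate $\Gamma_0(2)^t$.
Use induction on the length of $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(2)^t$ as a word in
$-I$, U and V to show that

\[ f_1(\gamma\tau)^6 = i^{-ac - (1/2)bd + (1/2)b^2 c} f_1(\tau)^6. \]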
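12.21. In this exercise we will show how to discover the transformation
law for $f_1(\tau)^6$ proved in part (b) of Exercise 12.20. Let $-I$, U and
V be as in Exercise 12.20. We will be using the groups

\begin{align*}
\tilde\Gamma(2) &= \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : b \equiv c \equiv 0 \bmod 2 \right\} \\
\tilde\Gamma(8) &= \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : b \equiv c \equiv 0 \bmod 8 \right\}.
\end{align*}

Note that $\tilde\Gamma(8) \subset \tilde\Gamma(2) \subset \Gamma_0(2)^t$, and recall from Exercise 12.4 that
$-I$, $U^2$ and V generate $\tilde\Gamma(2)$.

(a) Show that $\tilde\Gamma(2)$ has index 2 in $\Gamma_0(2)^t$ with I and U as coset
representatives.
(b) Show that $\tilde\Gamma(8)$ is normal in SL(2,Z) and that the quotient
$\tilde\Gamma(2)/\tilde\Gamma(8)$ is Abelian. Hint: compute $[U^2, V]$.
(c) We can now discover how $f_1(\tau)^6$ transforms under $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in
\tilde\Gamma(2)$. Write

\[ \gamma = \pm \prod_{i=1}^{s} U^{2a_i} V^{b_i}, \]

and set $A = \sum_{i=1}^{s} a_i$ and $B = \sum_{i=1}^{s} b_i$.
(i) Show that $f_1(\gamma\tau)^6 = i^{-2A - B} f_1(\tau)^6$.
(ii) Use (b) to show that $\gamma \equiv \pm U^{2A} V^{B} \bmod \tilde\Gamma(8)$, which means
that

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \pm \begin{pmatrix} 1 & 2B \\ 2A & 1 + 4AB \end{pmatrix} \bmod \tilde\Gamma(8). \]

(iii) Use (ii) to show that $ac \equiv 2A \bmod 8$ and $bd \equiv 2B \bmod 8$.
(iv) Conclude that for all $\gamma \in \tilde\Gamma(2)$,

\[ f_1(\gamma\tau)^6 = i^{-ac - (1/2)bd} f_1(\tau)^6. \]

(d) Now take $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(2)^t - \tilde\Gamma(2)$. By (a), we can write
$\gamma = U\tilde\gamma$ for some $\tilde\gamma \in \tilde\Gamma(2)$. Then use (c) to show that

\[ f_1(\gamma\tau)^6 = i^{-ac - (1/2)bd + (1/2)b^2} f_1(\tau)^6. \]

Hint: observe that $a^2 \equiv 1 \bmod 4$ in this case.
(e) To unify the formulas of (c) and (d), take $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(2)^t$.
Show that

\[ \tfrac{1}{2} b^2 c \equiv \begin{cases} 0 \bmod 4 & \gamma \in \tilde\Gamma(2) \\ \tfrac{1}{2} b^2 \bmod 4 & \gamma \notin \tilde\Gamma(2). \end{cases} \]

From here, it follows immediately that

\[ f_1(\gamma\tau)^6 = i^{-ac - (1/2)bd + (1/2)b^2 c} f_1(\tau)^6 \]

for all $\gamma \in \Gamma_0(2)^t$.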


12.22. Let $\mathcal{O} = [1, \sqrt{-m}]$ and $\mathcal{O}' = [1, 4\sqrt{-m}]$, where m > 0 is an integer
satisfying $m \equiv 6 \bmod 8$. Note that $\mathcal{O}'$ is the order of index 4 in $\mathcal{O}$.
Let $\mathfrak{a} = [4, 1 + \sqrt{-m}]$ and $\mathfrak{b} = [8, \sqrt{-m}]$.
(a) Show that $\mathfrak{a}$ and $\mathfrak{b}$ are proper fractional $\mathcal{O}'$-ideals. Hint: use
Lemma 7.5.
(b) Show that the class of $\mathfrak{a}$ has order 4 in $C(\mathcal{O}')$ and is in the
kernel of the natural map $C(\mathcal{O}') \to C(\mathcal{O})$.


(c) Verify that ab = [8,—2 + /—m] and a = [4,-1+ /-—m].

12.23. In this exercise we will prove part (ii) of Theorem 12.24. We are
thus concerned with $f(\sqrt{-m})^2$, where $m \equiv 3 \bmod 4$ is a positive
integer not divisible by 3. Let L denote the ring class field of the
order $\mathcal{O} = [1, \sqrt{-m}]$.

(a) Show that $f(\sqrt{-m})^6 \in L$ implies that $L = K(f(\sqrt{-m})^2)$.

(b) By Corollary 12.19, we have $f(\tau)^6 = \zeta_8 f_1(\tau + 1)^6$. Use this to
prove that $f(8\tau)^6$ is a modular function for $\Gamma_0(64)$. Hint: show
that $f_1(\tau)^6$ is invariant under

\[ \tilde\Gamma(8) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix} \bmod 8 \right\}. \]

Since $\tilde\Gamma(8)$ is normal in SL(2,Z), this implies that $f(\tau)^6$ is also
invariant under $\tilde\Gamma(8)$.
(c) Use Proposition 12.7 and Lemma 12.11 to show that

\[ f(\sqrt{-m})^6 = S\bigl(j([8, \sqrt{-m}]), j([1, 8\sqrt{-m}])\bigr) \]

for some rational function $S(X,Y) \in \mathbb{Q}(X,Y)$.

(d) Let $\mathcal{O}'$ be the order $[1, 8\sqrt{-m}]$. Show that $\mathfrak{a} = [8, 2 + \sqrt{-m}]$
and $\mathfrak{b} = [8, \sqrt{-m}]$ are proper fractional $\mathcal{O}'$-ideals. Then use
(c) to conclude that $f(\sqrt{-m})^6$ lies in the ring class field $L'$
of $\mathcal{O}'$.

(e) Show that the extension $L \subset L'$ has degree 8 and that under
the isomorphism $C(\mathcal{O}') \simeq Gal(L'/K)$, the classes of the ideals
$\mathfrak{a}$ and $\mathfrak{b}$ map to generators $\sigma_1$ and $\sigma_2$ of $Gal(L'/L)$. Thus we
need to prove that $f(\sqrt{-m})^6$ is fixed by both $\sigma_1$ and $\sigma_2$.

(f) Using (c) and Corollary 11.37, show that 


o1(f(V—m)”) = S(j([4,3 + V=m)), (8.6 + V—m))) 
72 f(V—m)") = S(i((1, 8V=m)), j([8, V—m))) 
(this is where m = 3 mod 4 is used). 
(g) Let 71 = (7'¢) and 72 = ({ ~4)- Then show that 
fmt)’ = S143 + tT), J(186 + 7])) 
f(r)’ = S(i(1,87)), (8,7). 
(h) Use Corollary 12.19 to show that f(7)° is invariant under both 


4 and 72. Then (f) and (g) imply that f(,/—m)° is fixed by o; 
and 02, which completes the proof. 


12.24. Let $\mathcal{O} = [1, \sqrt{-14}]$ and $\mathcal{O}' = [1, 4\sqrt{-14}]$. By part (a) of Exercise
12.22, we know that $\mathfrak{b} = [8, \sqrt{-14}]$ is a proper fractional $\mathcal{O}'$-ideal.
Under the natural map $C(\mathcal{O}') \to C(\mathcal{O})$, show that $\mathfrak{b}$ maps to the
unique element of order 2 of $C(\mathcal{O})$.


12.25. Compute $j(\sqrt{-46})$ and $j(\sqrt{-142})$. Hint: in each case the class
number is 4. Note also that $46 \equiv 142 \equiv 6 \bmod 8$, so that part (i)
of Theorem 12.24 applies.


12.26. Let (a,b) be a solution of the Diophantine equation $2(b^2 - 4a) =
(2b - a^2)^2$.

(a) Show that a and b must be even.

(b) If we set $X = -a/2$ and $Y = (b - a^2)/2$, then show that X and
Y are integer solutions of the Diophantine equation $2X(X^3 +
1) = Y^2$.


12.27. This exercise will discuss three of the Diophantine equations that
arose in the proof of Proposition 12.37. In each case, the methods
used in Exercises 5.21 and 5.22 are sufficient to determine the
integer solutions.

(a) Show that the only integer solutions of $X^3 + 1 = -Z^2$ are
$(X,Z) = (-1,0)$. Hint: work in the ring $\mathbb{Z}[i]$.

(b) Show that the only integer solutions of $W^6 + 1 = 2Z^2$ are
$(W,Z) = (\pm 1, \pm 1)$. Hint: work in the ring $\mathbb{Z}[\omega]$, $\omega = e^{2\pi i/3}$. The
fact that $3 \nmid W^2 + 1$ will be useful.

(c) Show that the only integer solutions of $X^3 + 1 = -2Z^2$ are
$(X,Z) = (-1,0)$. Hint: work in the ring $\mathbb{Z}[\sqrt{-2}]$.


12.28. Exercises 12.28 and 12.29 will present Euler's proof [33, Vol. II, pp.
56-58] that the only rational solutions of $X^3 + 1 = Z^2$ are $(X,Z) =
(-1,0)$, $(0,\pm 1)$ and $(2,\pm 3)$. In this exercise we will show that there
are no relatively prime positive integers c and b such that $bc(c^2 -
3bc + 3b^2)$ is a perfect square when $c \ne b$ and $3 \nmid c$. The proof will
use infinite descent. Then Exercise 12.29 will use this result to study
$X^3 + 1 = Z^2$.

(a) Assume that c and b are positive relatively prime integers such
that $bc(c^2 - 3bc + 3b^2)$ is a perfect square, and assume also
that $c \ne b$ and $3 \nmid c$. Show that b, c and $c^2 - 3bc + 3b^2$ are
relatively prime, and conclude that each is a perfect square. Then
write $c^2 - 3bc + 3b^2 = (\frac{m}{n} b - c)^2$, where n > 0, m > 0 and
gcd(m,n) = 1. Show that this implies

\[ \frac{b}{c} = \frac{2mn - 3n^2}{m^2 - 3n^2}. \]

There are two cases to consider, depending on whether $3 \nmid m$ or
$3 \mid m$.

(b) Preserving the notation of (a), let's consider the case $3 \nmid m$.
(i) Show that $b = 2mn - 3n^2$ and $c = m^2 - 3n^2$.
(ii) Since c is a perfect square, we can write $m^2 - 3n^2 =
(\frac{p}{q} n - m)^2$, where p > 0, q > 0 and gcd(p,q) = 1.
Show that p and q may be chosen so that $3 \nmid p$, and show
also that

\[ \frac{m}{n} = \frac{p^2 + 3q^2}{2pq}. \]

(iii) Prove that

\[ \frac{b}{n^2} = \frac{p^2 - 3pq + 3q^2}{pq}, \]

and conclude that $pq(p^2 - 3pq + 3q^2)$ is a perfect square.
Show also that $p \ne q$. Hint: use (i) and (ii) to show that
p = q implies c = b.

(iv) By (ii) and (iii) we see that p and q satisfy the same
conditions as c and b. Now prove that q < b, which shows
that the new solution is "smaller". Hint: note that $q \mid b$, so
that q < b unless q = n = b. Use (i) and (ii) to show that
c = b in this case.

(c) With the same notation as (a), we will now consider the case
$3 \mid m$. Then m = 3k, so that by (a),

\[ \frac{b}{c} = \frac{n^2 - 2nk}{n^2 - 3k^2}. \]

Since $3 \nmid n$, the argument of (b) implies that $b = n^2 - 2nk$ and
$c = n^2 - 3k^2$, and since c is a perfect square, we can write $n^2 -
3k^2 = (\frac{p}{q} k - n)^2$, where p > 0, q > 0 and gcd(p,q) = 1. As in
(b), we may assume $3 \nmid p$, and we also have

\[ \frac{n}{k} = \frac{p^2 + 3q^2}{2pq}. \]

(i) Show that

\[ \frac{b}{n^2} = \frac{p^2 - 4pq + 3q^2}{p^2 + 3q^2} = \frac{(p - q)(p - 3q)}{p^2 + 3q^2}, \]

and conclude that $(p - q)(p - 3q)(p^2 + 3q^2)$ is a perfect
square.

(ii) Let $t = |p - q|$ and $u = |p - 3q|$. Show that

\[ (p - q)(p - 3q)(p^2 + 3q^2) = tu(u^2 - 3tu + 3t^2). \]

Show also that $3 \nmid u$ and that t and u are positive and
unequal.

(iii) It follows from (i) and (ii) that u and t, divided by their
greatest common divisor, satisfy the same conditions as c
and b. Now prove that t < b, so that the new solution is
"smaller". Hint: consider the cases t = q - p and t = p -
q separately. In the latter case, note that $p \mid n + \sqrt{n^2 - 3k^2}$,
and that $p = n + \sqrt{n^2 - 3k^2}$ implies q = k.

Thus, given c and b satisfying the above conditions, we can always
produce a pair of integers satisfying the same conditions, but with
strictly smaller b. By infinite descent, no such c and b can exist.


12.29. We can now show that the only rational solutions of $X^3 + 1 = Z^2$
are $(X,Z) = (-1,0)$, $(0,\pm 1)$ and $(2,\pm 3)$. Let (X,Z) be a rational
solution, and write X = a/b, where b > 0 and gcd(a,b) = 1. Assume
in addition that $a/b \ne -1$, 0 or 2, and set c = a + b. Our
goal is to derive a contradiction.
(a) Show that $b(a^3 + b^3) = bc(c^2 - 3bc + 3b^2)$ is a perfect square
and that b and c are relatively prime, positive, and unequal.
(b) It follows from Exercise 12.28 that $3 \mid c$. Then c = 3d and $3 \nmid b$.
Show that $bd(b^2 - 3bd + 3d^2)$ is a perfect square, and use
Exercise 12.28 to show that b = d. This implies b = d = 1, and
hence c = 3. Then a/b = 2, which contradicts our initial
assumption.


12.30. If K and K' are imaginary quadratic fields and $j(\mathcal{O}_K) = j(\mathcal{O}_{K'})$,
then prove that K = K'. Hint: use Theorem 10.9.


§13. THE CLASS EQUATION 


Now that we have discussed singular j-invariants and computed some
examples, it is time to turn our attention to their minimal polynomials. Given an
order $\mathcal{O}$ in an imaginary quadratic field K, $H_{\mathcal{O}}(X)$ will denote the monic
minimal polynomial of $j(\mathcal{O})$ over $\mathbb{Q}$. Note that $H_{\mathcal{O}}(X)$ has integer
coefficients since $j(\mathcal{O})$ is an algebraic integer. The equation $H_{\mathcal{O}}(X) = 0$ is called
the class equation, and by abuse of terminology we will refer to $H_{\mathcal{O}}(X)$ as
the class equation. Since $\mathcal{O}$ is uniquely determined by its discriminant D,
we will often write $H_D(X)$ instead of $H_{\mathcal{O}}(X)$.

For an example of a class equation, consider the order $\mathbb{Z}[\sqrt{-14}]$ of
discriminant $-56$. Its j-invariant is $j(\sqrt{-14})$, which we computed in (12.1).
Thus the minimal polynomial of $j(\sqrt{-14})$ is

\begin{align*}
H_{-56}(X) = X^4 &- 2^8 \cdot 19 \cdot 937 \cdot 3559\, X^3 + 2^{13} \cdot 251421776987\, X^2 \tag{13.1} \\
&+ 2^{20} \cdot 3 \cdot 11^6 \cdot 19 \cdot 21323\, X + (2^8 \cdot 11^2 \cdot 17 \cdot 41)^3,
\end{align*}

where the coefficients have been factored into primes. Note that the
constant term, being the norm of $j(\sqrt{-14}) = \gamma_2(\sqrt{-14})^3$, is a cube by Theorem
12.2.

The first part of §13 will describe an algorithm for computing the class
equation $H_D(X)$ for any discriminant D. We have a special reason to be
interested in this question, for by Theorem 9.2, the polynomial $H_{-4n}(X)$
gives us the criterion for when a prime is of the form $x^2 + ny^2$. Thus our
algorithm will provide a constructive version of Theorem 9.2. In the second
part of §13, we will discuss some more recent work of Deuring, Gross and
Zagier on the class equation. We will see that there are strong restrictions
on primes dividing the discriminant and constant term of the class equation.
The small size of the primes appearing in the constant term of (13.1) is thus
no accident.


A. Computing the Class Equation 


We will begin by giving a more precise description of the class equation: 


Proposition 13.2. Let $\mathcal{O}$ be an order in an imaginary quadratic field K, and
let $\mathfrak{a}_i$, i = 1,...,h be ideal class representatives (so that h is the class number).
Then the class equation is given by the formula

\[ H_{\mathcal{O}}(X) = \prod_{i=1}^{h} (X - j(\mathfrak{a}_i)). \]


Proof. This result is an easy consequence of Corollary 11.37 (see Exercise
13.1), but there is a more elementary argument which we will now give.
By Theorem 11.1, $K(j(\mathcal{O}))$ is the ring class field of $\mathcal{O}$. Thus $[K(j(\mathcal{O})) :
K] = h$, and since $j(\mathcal{O})$ is real, it follows that $[\mathbb{Q}(j(\mathcal{O})) : \mathbb{Q}] = h$. This shows
that $H_{\mathcal{O}}(X)$ has degree h. Now let $\alpha$ be a root of $H_{\mathcal{O}}(X)$, and let $\sigma$ be an
automorphism of $\mathbb{C}$ that takes $j(\mathcal{O})$ to $\alpha$. In the proof of Theorem 10.23
we showed that $\sigma(j(\mathcal{O})) = j(\mathfrak{a})$ for some proper fractional $\mathcal{O}$-ideal $\mathfrak{a}$ (see
(10.26)). Hence every root of $H_{\mathcal{O}}(X)$ is also a root of $\prod_{i=1}^{h} (X - j(\mathfrak{a}_i))$, and
since both polynomials are monic of degree h, they must be equal. Q.E.D.
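
Proposition 13.2 can be explored numerically. The sketch below is an illustration only and is not the algorithm developed in this section: it lists the reduced forms (a, b, c) of discriminant D, uses $\tau = (-b + \sqrt{D})/2a$ as a representative of each ideal class, evaluates $j(\tau)$ in floating point through the standard formula $j = E_4^3/\Delta$ with $\Delta = q\prod_{n \ge 1}(1 - q^n)^{24}$, and multiplies out the product. The function names and the truncation at 60 terms are assumptions made for the example, and floating point only gives the large coefficients to about 15 significant digits.

```python
import cmath
from math import gcd, isqrt, pi, sqrt

def reduced_forms(D):
    """Primitive reduced forms (a, b, c) of discriminant D < 0."""
    forms = []
    for a in range(1, isqrt(-D // 3) + 1):
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
    return forms

def j_invariant(tau, terms=60):
    """j(tau) = E4^3 / Delta, with q = exp(2 pi i tau)."""
    q = cmath.exp(2j * pi * tau)
    e4, delta, qn = 1 + 0j, q, 1 + 0j
    for n in range(1, terms):
        qn *= q                               # qn = q^n
        e4 += 240 * n**3 * qn / (1 - qn)      # E4 = 1 + 240 sum n^3 q^n/(1 - q^n)
        delta *= (1 - qn) ** 24
    return e4**3 / delta

def class_polynomial(D):
    """Approximate coefficients of H_D(X), highest degree first."""
    coeffs = [1 + 0j]
    for a, b, c in reduced_forms(D):
        root = j_invariant(complex(-b, sqrt(-D)) / (2 * a))
        # multiply the current polynomial by (X - root)
        coeffs = [x - root * y for x, y in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

forms = reduced_forms(-56)
H = class_polynomial(-56)
print(forms)
print([c.real for c in H])
```

For D = -56 the coefficient of $X^3$ and the constant term can be compared with $-2^8 \cdot 19 \cdot 937 \cdot 3559 = -16220384512$ and $(2^8 \cdot 11^2 \cdot 17 \cdot 41)^3 = 21590272^3$ from (13.1).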


An important consequence of this proposition is that $H_{\mathcal{O}}(X)$ is the
minimal polynomial of $j(\mathfrak{a})$, where $\mathfrak{a}$ is any proper fractional $\mathcal{O}$-ideal.

The algorithm we will present for computing $H_{\mathcal{O}}(X)$ uses the theory
of complex multiplication, and in particular, the polynomial $\Phi_m(X,X)$
obtained by setting X = Y in the modular equation plays an important role.
The reason for this is the following observation:


Lemma 13.3. Let m > 1. If $\mathcal{O}$ has a primitive element of norm m, then the
class equation $H_{\mathcal{O}}(X)$ is an irreducible factor of $\Phi_m(X,X)$. Furthermore,
every irreducible factor of $\Phi_m(X,X)$ arises in this way.

Proof. Let $\alpha \in \mathcal{O}$ be a primitive element of norm m. Corollary 11.27 tells
us that $\alpha\mathcal{O} \subset \mathcal{O}$ is a cyclic sublattice of index m, and it follows that

\[ 0 = \Phi_m(j(\alpha\mathcal{O}), j(\mathcal{O})) = \Phi_m(j(\mathcal{O}), j(\mathcal{O})). \]

Thus $j(\mathcal{O})$ is a root of $\Phi_m(X,X)$, which implies that its minimal polynomial
$H_{\mathcal{O}}(X)$ is a factor of $\Phi_m(X,X)$.

To show that every irreducible factor of $\Phi_m(X,X)$ is a class equation,
suppose that $\Phi_m(\beta,\beta) = 0$. Then Theorem 11.23 implies that $\beta = j(L) =
j(L')$, where $L' \subset L$ is a cyclic sublattice of index m. By Theorem 10.9,
$L' = \alpha L$ for some complex number $\alpha$, and then $\alpha$ is primitive of norm m
by Corollary 11.27. Thus $\alpha \notin \mathbb{Z}$, so that L has complex multiplication by $\alpha$.
By Theorem 10.14, this means that up to homothety, L is a proper
fractional $\mathcal{O}$-ideal for some order $\mathcal{O}$ in an imaginary quadratic field. Then $\beta =
j(L)$ has $H_{\mathcal{O}}(X)$ as its minimal polynomial, and hence $H_{\mathcal{O}}(X)$ is the
corresponding irreducible factor of $\Phi_m(X,X)$. This proves the lemma. Q.E.D.


The next question is, what power of $H_{\mathcal{O}}(X)$ appears in the factorization
of $\Phi_m(X,X)$? The answer involves the number $r(\mathcal{O},m)$, which is defined
as follows: given an order $\mathcal{O}$ in an imaginary quadratic field and a positive
integer m, set

\[ r(\mathcal{O},m) = |\{\alpha \in \mathcal{O} : \alpha \text{ is primitive}, N(\alpha) = m\}/\mathcal{O}^*|, \]

where the units $\mathcal{O}^*$ act by sending $\alpha$ to $\epsilon\alpha$ for $\epsilon \in \mathcal{O}^*$. It is easy to see that
$r(\mathcal{O},m)$ is finite, and for a given m, there are only finitely many orders with
$r(\mathcal{O},m) > 0$ (see Exercise 13.2). Then the following theorem tells us how to
factor $\Phi_m(X,X)$:


Theorem 13.4. If m > 1, there is a constant $c_m \in \mathbb{C}^*$ such that

\[ \Phi_m(X,X) = c_m \prod_{\mathcal{O}} H_{\mathcal{O}}(X)^{r(\mathcal{O},m)}. \]


Proof. Fix an order $\mathcal{O}$, and pick a number $\tau_0$ in the upper half plane such
that $\mathcal{O} = [1, \tau_0]$. To prove the theorem, it suffices to show that $j(\mathcal{O}) = j(\tau_0)$
is a root of $\Phi_m(X,X)$ of multiplicity $r(\mathcal{O},m)$.

We begin by studying the multiplicity of $j(\tau_0)$ as a root of $\Phi_m(X, j(\tau_0))$.
Using the standard factorization

\[ \Phi_m(X, j(\tau)) = \prod_{\sigma \in C(m)} (X - j(\sigma\tau)), \]

we see that

\[ \Phi_m(X, j(\tau_0)) = (X - j(\tau_0))^r \prod_{j(\sigma\tau_0) \ne j(\tau_0)} (X - j(\sigma\tau_0)), \]

where

\[ r = |\{\sigma \in C(m) : j(\sigma\tau_0) = j(\tau_0)\}|. \tag{13.5} \]

Thus $j(\tau_0)$ is a root of multiplicity r of $\Phi_m(X, j(\tau_0))$.

We will next show that the number r of (13.5) is the multiplicity of $j(\tau_0)$
as a root of $\Phi_m(X,X)$. To see what's involved, suppose that we have a
polynomial F(X,Y) and a number $X_0$ such that $F(X_0, X_0) = 0$. Then $X_0$ is
a root of both F(X,X) and $F(X, X_0)$, but in general, the multiplicities of
these roots are different (see Exercise 13.3 for an example). So it will take
a special argument to show that $j(\tau_0)$ has the same multiplicity for both
$\Phi_m(X,X)$ and $\Phi_m(X, j(\tau_0))$. The basic idea is to show that

\[ \lim_{u \to j(\tau_0)} \frac{\Phi_m(u,u)}{\Phi_m(u, j(\tau_0))} \]

is nonzero, which will force the multiplicities to be equal (see Exercise
13.3). To study this limit, note that

\begin{align*}
\lim_{u \to j(\tau_0)} \frac{\Phi_m(u,u)}{\Phi_m(u, j(\tau_0))} &= \lim_{\tau \to \tau_0} \frac{\Phi_m(j(\tau), j(\tau))}{\Phi_m(j(\tau), j(\tau_0))} \\
&= \lim_{\tau \to \tau_0} \prod_{\sigma \in C(m)} \frac{j(\tau) - j(\sigma\tau)}{j(\tau) - j(\sigma\tau_0)}.
\end{align*}

It suffices to compute the limit of each factor individually, and note that
if $j(\tau_0) \ne j(\sigma\tau_0)$, then the limit of the corresponding factor is 1. Thus it
remains to study the limit

\[ \lim_{\tau \to \tau_0} \frac{j(\tau) - j(\sigma\tau)}{j(\tau) - j(\sigma\tau_0)} \tag{13.6} \]

when $\sigma \in C(m)$ satisfies $j(\tau_0) = j(\sigma\tau_0)$.

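The equality $j(\tau_0) = j(\sigma\tau_0)$ implies that there is some $\gamma \in$ SL(2,Z) such
that $\sigma\tau_0 = \gamma\tau_0$. If we set $\tilde\sigma = \gamma^{-1}\sigma$, then $\tilde\sigma$ fixes $\tau_0$. Note also that $\det(\tilde\sigma) =
m$ and that the entries of $\tilde\sigma$ are relatively prime. Using $\tilde\sigma$, the limit (13.6)
can be written

\[ \lim_{\tau \to \tau_0} \frac{j(\tau) - j(\tilde\sigma\tau)}{j(\tau) - j(\tau_0)}. \]

Consider the Taylor expansion of $j(\tau)$ about $\tau = \tau_0$:

\[ j(\tau) = j(\tau_0) + a_k(\tau - \tau_0)^k + \cdots, \quad a_k \ne 0. \]

Substituting $\tilde\sigma\tau$ for $\tau$, we get the series

\[ j(\tilde\sigma\tau) = j(\tau_0) + a_k(\tilde\sigma\tau - \tau_0)^k + \cdots, \]

and then one computes that

\[ \frac{j(\tau) - j(\tilde\sigma\tau)}{j(\tau) - j(\tau_0)} = \frac{a_k(\tau - \tau_0)^k - a_k(\tilde\sigma\tau - \tau_0)^k + \cdots}{a_k(\tau - \tau_0)^k + \cdots} = \frac{1 - \left(\dfrac{\tilde\sigma\tau - \tau_0}{\tau - \tau_0}\right)^k + \cdots}{1 + \cdots}. \]

Since

\[ \lim_{\tau \to \tau_0} \frac{\tilde\sigma\tau - \tau_0}{\tau - \tau_0} = \lim_{\tau \to \tau_0} \frac{\tilde\sigma\tau - \tilde\sigma\tau_0}{\tau - \tau_0} = \tilde\sigma'(\tau_0), \]

it follows that the limit (13.6) equals $1 - \tilde\sigma'(\tau_0)^k$, and thus we need to prove
that

\[ \tilde\sigma'(\tau_0)^k \ne 1, \tag{13.7} \]

where k is the order of vanishing of $j(\tau) - j(\tau_0)$ at $\tau_0$.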
If we write $\tilde\sigma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$,
then an easy computation shows that

\[ \tilde\sigma'(\tau) = \frac{m}{(c\tau + d)^2}. \]

Note also that $c \ne 0$ since $\tilde\sigma$ fixes $\tau_0$ (see Exercise
13.4). Now suppose that $j(\tau_0) \ne 0$ or $1728$. Then, by part (iv) of
Theorem 11.2, it follows that $k = 1$, so that (13.7) reduces to

\[ \frac{m}{(c\tau_0 + d)^2} \ne 1, \]

which is obvious since $c \ne 0$ and $\tau_0$ is not a real number. When
$j(\tau_0) = 1728$, we can assume that $\tau_0 = i$ (recall that
$j(i) = 1728$), and then Theorem 11.2 tells us that $k = 2$. Thus if (13.7)
failed to hold, we would have

\[ m^2 = (ci + d)^4, \]

which implies that $c = \pm\sqrt{m}$ and $d = 0$ (see Exercise 13.4). Then
$\tilde\sigma(i) = i$ tells us that $a = 0$ and $b = \mp\sqrt{m}$. So either
$\tilde\sigma$ doesn't have integer entries (when $m$ is not a perfect
square), or the entries are integers with a common divisor (since $m > 1$).
Both cases contradict what we know about $\tilde\sigma$, so that (13.7) holds
when $j(\tau_0) = 1728$. The case when $j(\tau_0) = 0$ is similar and is left
to the reader (see Exercise 13.4).

We should mention that the standard treatment of (13.6) in the literature
(see Deuring [24, §12] or Lang [73, Appendix to §10]) seems to be
incomplete.

We have thus shown that the multiplicity of $j(\tau_0)$ as a root of
$\Phi_m(X,X)$ is

\[ r = |\{\sigma \in C(m) : j(\sigma\tau_0) = j(\tau_0)\}|, \]

and it remains to show that $r = r(\mathcal{O},m)$, where

\[ r(\mathcal{O},m) = |\{\alpha \in \mathcal{O} : \alpha \text{ is primitive},\ N(\alpha) = m\}/\mathcal{O}^*|. \]

To prove the desired equality, we will construct a map
$\alpha \mapsto \sigma$. Namely, if $\alpha \in \mathcal{O}$ is primitive of
norm $m$, then by Corollary 11.27, $\alpha\mathcal{O}$ is a cyclic sublattice
of $\mathcal{O}$ of index $m$, and since $\mathcal{O} = [1,\tau_0]$, Lemma
11.24 implies that there is a unique
$\sigma = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in C(m)$ such that
$\alpha\mathcal{O} = d[1,\sigma\tau_0]$. Then $\sigma$ satisfies
$j(\sigma\tau_0) = j(\tau_0)$, and note also that if
$\epsilon \in \mathcal{O}^*$, then $\epsilon\alpha$ maps to the same $\sigma$
that $\alpha$ does. Thus we have constructed a well-defined map

\[ \{\alpha \in \mathcal{O} : \alpha \text{ is primitive and } N(\alpha) = m\}/\mathcal{O}^*
   \longrightarrow \{\sigma \in C(m) : j(\sigma\tau_0) = j(\tau_0)\}. \]

This map is easily seen to be bijective (see Exercise 13.5), which proves that
$r = r(\mathcal{O},m)$. This completes the proof of Theorem 13.4. Q.E.D.


Besides knowing the factorization of $\Phi_m(X,X)$, its degree is easy to
compute:

Proposition 13.8. If $m > 1$, then the degree of $\Phi_m(X,X)$ is

\[ 2 \sum_{\substack{a \mid m \\ a > \sqrt{m}}} \frac{a}{\gcd(a,m/a)}\,\phi(\gcd(a,m/a)) + \phi(\sqrt{m}), \]

where $\phi$ is the Euler $\phi$-function and $\phi(\sqrt{m}) = 0$ when $m$ is
not a perfect square.

Proof. The proof of this proposition is given in Exercise 13.6. Q.E.D.
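
The degree formula of Proposition 13.8 is easy to evaluate by machine. The following is a minimal Python sketch, not part of the original text; the function names `euler_phi` and `diagonal_degree` are illustrative choices, and the totient is computed by a direct count that is only sensible for small $m$.

```python
from math import gcd, isqrt

def euler_phi(n):
    """Euler's phi-function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def diagonal_degree(m):
    """Degree of Phi_m(X, X) for m > 1, evaluated from Proposition 13.8."""
    total = 0
    for a in range(1, m + 1):
        if m % a == 0 and a * a > m:      # divisors a of m with a > sqrt(m)
            e = gcd(a, m // a)
            total += 2 * (a // e) * euler_phi(e)
    root = isqrt(m)
    if root * root == m:                  # phi(sqrt(m)) is 0 unless m is a square
        total += euler_phi(root)
    return total

print(diagonal_degree(3))
```

For $m = 3$ only $a = 3$ contributes, giving the degree 6 that is used for $\Phi_3(X,X)$ later in this section.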


If we write $r(\mathcal{O},m)$ as $r(D,m)$, where $D$ is the discriminant of
$\mathcal{O}$, then Proposition 13.8 and Theorem 13.4 allow us to express the
degree of $\Phi_m(X,X)$ in two ways. This gives us the following corollary,
which is one of Kronecker's class number relations:

Corollary 13.9. If $m > 1$, then

\[ \sum_{D} r(D,m)\,h(D) = 2 \sum_{\substack{a \mid m \\ a > \sqrt{m}}} \frac{a}{\gcd(a,m/a)}\,\phi(\gcd(a,m/a)) + \phi(\sqrt{m}). \]

Q.E.D.


To illustrate the above theorems, let's study the case $m = 3$. There are
only four orders with primitive elements of norm 3, namely
$\mathbb{Z}[\omega]$, $\mathbb{Z}[\sqrt{-3}]$, $\mathbb{Z}[\sqrt{-2}]$ and
$\mathbb{Z}[(1 + \sqrt{-11})/2]$, and the corresponding $r(D,3)$'s are 1, 1, 2
and 2 respectively (see Exercise 13.7). Then Theorem 13.4 tells us that

\[ \Phi_3(X,X) = \pm H_{-3}(X)\,H_{-12}(X)\,H_{-8}(X)^2\,H_{-11}(X)^2, \tag{13.10} \]

and since $\Phi_3(X,X)$ has degree 6 by Proposition 13.8, we get the
following class number relation:

\[ 6 = h(-3) + h(-12) + 2h(-8) + 2h(-11). \]

This equation implies that all four class numbers must be one.
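
The multiplicities $r(D,3)$ quoted above can be checked by brute force. The sketch below is an illustration rather than the method of the text: it assumes the standard basis $[1,w]$ of the order of discriminant $D$, with $w = \sqrt{D}/2$ or $(1+\sqrt{D})/2$, so that an element $x + yw$ is primitive exactly when $\gcd(x,y) = 1$. The function name `r` mirrors the notation $r(D,m)$.

```python
from math import gcd, isqrt

def r(D, m):
    """Primitive elements of norm m in the order of discriminant D < 0, modulo units."""
    t = D % 2                    # trace of w: 0 if D is even, 1 if D is odd
    n = (t - D) // 4             # norm of w, so N(x + y*w) = x^2 + t*x*y + n*y^2
    bound = 2 * isqrt(m) + 2     # generous search box for x and y
    count = 0
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if gcd(x, y) == 1 and x * x + t * x * y + n * y * y == m:
                count += 1
    units = 6 if D == -3 else 4 if D == -4 else 2
    return count // units

print([r(D, 3) for D in (-3, -12, -8, -11)])
```

Weighting each class number by these multiplicities reproduces the class number relation $6 = 1 + 1 + 2 + 2$ when all four class numbers are one.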
We can work out (13.10) more explicitly, for we know $\Phi_3(X,Y)$ from
(11.22). Setting $X = Y$ gives us

\[ \begin{aligned}
\Phi_3(X,X) = {} & -X^6 + 4464X^5 + 2585778176X^4 + 17800519680000X^3 \\
& - 769939996672000000X^2 + 3710851743744000000000X,
\end{aligned} \]

and factoring this over $\mathbb{Q}$, we obtain

\[ \Phi_3(X,X) = -X(X - 54000)(X - 8000)^2(X + 32768)^2. \]

However, in §§10 and 12, we computed the $j$-invariants
$j((1 + \sqrt{-3})/2) = 0$, $j(\sqrt{-2}) = 8000$ and
$j((1 + \sqrt{-11})/2) = -32768$. Thus we recognize three of the above four
factors, and it follows that the fourth must be $H_{-12}(X)$, i.e.,

\[ H_{-12}(X) = X - j(\sqrt{-3}) = X - 54000. \]

This proves that $j(\sqrt{-3}) = 54000$.
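
The factorization of $\Phi_3(X,X)$ can be confirmed by multiplying the linear factors back together with exact integer arithmetic. This is a small Python check, offered as a sketch; the helper `poly_mul` and the convention of listing coefficients from the constant term upward are choices made here, not notation from the text.

```python
def poly_mul(p, q):
    """Multiply two integer polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# -X (X - 54000) (X - 8000)^2 (X + 32768)^2
factors = [[0, 1], [-54000, 1], [-8000, 1], [-8000, 1], [32768, 1], [32768, 1]]
product = [-1]
for f in factors:
    product = poly_mul(product, f)

# Coefficients of Phi_3(X, X) as displayed above, constant term first.
phi3_diagonal = [0, 3710851743744000000000, -769939996672000000,
                 17800519680000, 2585778176, 4464, -1]
print(product == phi3_diagonal)
```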

Let's now turn to the general problem of computing a given class equation
$H_D(X)$. Since $\Phi_m(X,X)$ will have many factors, we need to know which
one is the particular $H_D(X)$ we're interested in. The basic idea is to use
multiplicities to distinguish the factors we seek. In particular, the factors
of multiplicity one play an especially important role. Let's define the
polynomial

\[ \Phi_{m,1}(X,X) = \prod_{r(D,m) = 1} H_D(X). \]

By Theorem 13.4, we know that $\Phi_{m,1}(X,X)$ is the product of the
multiplicity one factors of $\Phi_m(X,X)$. We can describe $\Phi_{m,1}(X,X)$
as follows:

Proposition 13.11. If $m > 1$, then $\Phi_{m,1}(X,X)$ equals

\[ \begin{cases}
H_{-4}(X)\,H_{-8}(X), & \text{if } m = 2 \\
H_{-m}(X)\,H_{-4m}(X), & \text{if } m \equiv 3 \bmod 4 \text{ and } m \ne 3k^2,\ k > 1 \\
H_{-4m}(X), & \text{if } m > 2,\ m \not\equiv 3 \bmod 4 \text{ or } m = 3k^2,\ k > 1.
\end{cases} \]

Proof. Let's first show that the $H_D(X)$'s listed above are factors of
multiplicity one of $\Phi_m(X,X)$. Since $\pm\sqrt{-m}$ are the only primitive
norm $m$ elements of $\mathbb{Z}[\sqrt{-m}]$, it follows that $H_{-4m}(X)$ is
a factor of multiplicity 1. When $m = 2$, the elements of norm 2 in
$\mathbb{Z}[i]$ are $\pm 1 \pm i$, which are all associate under
$\mathbb{Z}[i]^*$. Thus $H_{-4}(X)$ is also a factor of multiplicity one.
Finally, when $m \equiv 3 \bmod 4$ and $m \ne 3k^2$, $k > 1$, we need to
consider the multiplicity of $H_{-m}(X)$. The order
$\mathbb{Z}[(1 + \sqrt{-m})/2]$ has at least two primitive norm $m$ elements,
namely $\pm\sqrt{-m}$. To see if there are any others, suppose that
$a + b(1 + \sqrt{-m})/2$ is also primitive of norm $m$. Then $b \ne 0$ and,
taking norms,

\[ 4m = (2a + b)^2 + mb^2. \]

Thus $b = \pm 1$ or $\pm 2$, and $b = \pm 2$ leads to the solutions we already
know. So what happens if $b = \pm 1$? This clearly implies
$3m = (2a + b)^2$, so that $m = 3k^2$, and since $k > 1$ is excluded by
hypothesis, we see that $m = 3$. Here, $b = \pm 1$ leads to 4 more solutions,
but since $|\mathbb{Z}[\zeta_3]^*| = 6$, we still get a multiplicity one
factor.

The next step is to show that these are the only factors of multiplicity
one. So suppose that $r(\mathcal{O},m) = 1$ for some order $\mathcal{O}$. For
simplicity, let's also assume that $\mathcal{O}^* = \{\pm 1\}$. Given
$\alpha \in \mathcal{O}$ primitive of norm $m$, note that $\pm\alpha$ and
$\pm\bar\alpha$ are also primitive of the same norm. Then
$r(\mathcal{O},m) = 1$ implies that $\bar\alpha = \pm\alpha$. But
$\bar\alpha = \alpha$ is easily seen to be impossible ($\alpha$ is primitive
and $m > 1$), so that $\bar\alpha = -\alpha$. This means that $\alpha$ is a
rational multiple of $\sqrt{D}$, where $D$ is the discriminant of
$\mathcal{O}$. The argument now breaks up into two cases.

If $D \equiv 0 \bmod 4$, then $\mathcal{O} = [1,\sqrt{D}/2]$, so that
$\alpha$, being primitive, must be $\pm\sqrt{D}/2$. This implies that
$m = N(\alpha) = -D/4$, hence $D = -4m$. The corresponding factor is thus
$H_{-4m}(X)$, which is one of the ones we know.

If $D \equiv 1 \bmod 4$, then $\mathcal{O} = [1,(1 + \sqrt{D})/2]$, so that
$\alpha = a + b(1 + \sqrt{D})/2$. Since $\alpha$ is a multiple of $\sqrt{D}$,
we have $2a + b = 0$, and since $a$ and $b$ are relatively prime ($\alpha$ is
primitive), we have $b = \pm 2$. This means that $\alpha = \pm\sqrt{D}$, so
that $m = N(\alpha) = -D$. Thus $D = -m$, and this will be the other case we
know once we prove that $m \ne 3k^2$, $k > 1$. So suppose that $m$ has this
form. Then $D = -3k^2$, which means that $\mathcal{O}$ is the order of
conductor $k$ in $\mathbb{Z}[\zeta_3]$. Note that
$\mathcal{O}^* = \{\pm 1\}$ since $k > 1$. One easily computes that
$\pm k\sqrt{-3}$ and $\pm k(1 - \zeta_3)$ are primitive elements of
$\mathcal{O}$ of norm $3k^2 = m$, which contradicts our assumption that
$r(\mathcal{O},m) = 1$.

It remains to consider the case when $\mathcal{O}^* \ne \{\pm 1\}$. We leave
it to the reader to check that when $\mathcal{O} = \mathbb{Z}[\zeta_3]$
(resp. $\mathcal{O} = \mathbb{Z}[i]$), $r(\mathcal{O},m) = 1$ implies that
$m = 3$ (resp. $m = 2$) (see Exercise 13.8). This completes the proof of
Proposition 13.11. Q.E.D.
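
The case distinction in Proposition 13.11 can be written as a small lookup. The Python sketch below only encodes the statement of the proposition; the function name `multiplicity_one_discriminants` is hypothetical and nothing here computes the polynomials $H_D(X)$ themselves.

```python
from math import isqrt

def multiplicity_one_discriminants(m):
    """Discriminants D with H_D(X) dividing Phi_{m,1}(X, X), per Proposition 13.11 (m > 1)."""
    if m == 2:
        return [-4, -8]
    quotient, remainder = divmod(m, 3)
    # m = 3k^2 with k > 1
    is_3k2 = remainder == 0 and quotient > 1 and isqrt(quotient) ** 2 == quotient
    if m % 4 == 3 and not is_3k2:
        return [-m, -4 * m]
    return [-4 * m]

for m in (2, 3, 5, 7, 27):
    print(m, multiplicity_one_discriminants(m))
```

For $m = 3$ this returns the discriminants $-3$ and $-12$, matching the two multiplicity one factors in (13.10).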


It is now fairly easy to compute $H_D(X)$ using the $\Phi_m(X,X)$'s. In the
discussion that follows, $m$ will denote a positive integer, and for
simplicity we will assume $m > 3$. It turns out that there are three cases to
consider.

If $m \not\equiv 3 \bmod 4$ or $m = 3k^2$, then Proposition 13.11 tells us
that

\[ H_{-4m}(X) = \Phi_{m,1}(X,X), \]

so that once we factor $\Phi_m(X,X)$ into irreducibles, we know
$H_{-4m}(X)$.

Next, if $m \equiv 3 \bmod 8$ and $m \ne 3k^2$, then Proposition 13.11 tells
us that

\[ H_{-m}(X)\,H_{-4m}(X) = \Phi_{m,1}(X,X). \tag{13.12} \]

However, since $m > 3$ and $m \equiv 3 \bmod 8$, it follows from Corollary
7.28 that $h(-4m) = 3h(-m)$, so that $H_{-4m}(X)$ has greater degree than
$H_{-m}(X)$. Thus, factoring $\Phi_m(X,X)$ determines both $H_{-m}(X)$ and
$H_{-4m}(X)$.

Finally, if $m \equiv 7 \bmod 8$, then (13.12) still holds, but this time more
work is needed since $H_{-m}(X)$ and $H_{-4m}(X)$ have the same degree by
Corollary 7.28. We claim that

\[ H_{-m}(X) = \gcd\bigl(\Phi_{m,1}(X,X),\ \Phi_{(m+1)/4}(X,X)\bigr). \tag{13.13} \]

To see this, first note that $H_{-m}(X)$ divides $\Phi_{(m+1)/4}(X,X)$ since
in the order of discriminant $-m$, $(1 + \sqrt{-m})/2$ is primitive of norm
$(m+1)/4$. If we turn to the order of discriminant $-4m$, there are no
primitive elements of norm $(m+1)/4$ (see Exercise 13.9), and (13.13)
follows. Thus, to determine $H_{-m}(X)$ and $H_{-4m}(X)$, we need to factor
both $\Phi_m(X,X)$ and $\Phi_{(m+1)/4}(X,X)$ into irreducibles.

Using the above process, it is now easy to compute any $H_D(X)$, assuming
that we know the requisite modular equation (or equations). Some simple
examples are given in Exercise 13.10.


B. Computing the Modular Equation 


To complete our algorithm for finding the class equation, we need to know
how to compute the modular equation $\Phi_m(X,Y) = 0$. This turns out to be
the weak link in our theory, for while such an algorithm exists, it is so
cumbersome that it can be implemented only for very small $m$.

The first step in computing $\Phi_m(X,Y)$ is to reduce to the case when $m$
is prime. This is done by means of the following proposition:

Proposition 13.14. Let $m > 1$ be an integer, and set
$\Psi(m) = m\prod_{p \mid m}(1 + 1/p)$, which is the degree of $\Phi_m(X,Y)$
as a polynomial in $X$.

(i) If $m = m_1 m_2$, where $m_1$ and $m_2$ are relatively prime, then

\[ \Phi_m(X,Y) = \prod_{i=1}^{\Psi(m_2)} \Phi_{m_1}(X,\xi_i), \]

where $X = \xi_i$ are the roots of $\Phi_{m_2}(X,Y) = 0$.

(ii) If $m = p^a$, where $p$ is prime and $a > 1$, then

\[ \Phi_m(X,Y) = \begin{cases}
\dfrac{\prod_{i=1}^{\Psi(p^{a-1})} \Phi_p(X,\xi_i)}{\Phi_{p^{a-2}}(X,Y)^p}, & a > 2 \\[2ex]
\dfrac{\prod_{i=1}^{\Psi(p)} \Phi_p(X,\xi_i)}{(X - Y)^{p+1}}, & a = 2,
\end{cases} \]

where $X = \xi_i$ are the roots of $\Phi_{p^{a-1}}(X,Y) = 0$.

Proof. See Weber [102, §69]. Q.E.D.
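
As a consistency check on Proposition 13.14, the degrees on both sides of each formula can be compared using $\Psi(m)$. The following Python sketch is illustrative only; `prime_divisors` and `psi` are names introduced here, and the checks concern degrees in $X$, not the polynomials themselves.

```python
def prime_divisors(m):
    """Distinct prime divisors of m by trial division."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes

def psi(m):
    """Psi(m) = m * prod over primes p | m of (1 + 1/p), the degree of Phi_m(X, Y) in X."""
    value = m
    for p in prime_divisors(m):
        value = value // p * (p + 1)
    return value

# Part (i): degrees multiply for relatively prime m1, m2.
print(psi(12) == psi(3) * psi(4))
# Part (ii), a = 2: Psi(p)(p + 1) - (p + 1) = Psi(p^2).
print(psi(3) * 4 - 4 == psi(9))
# Part (ii), a > 2: Psi(p^(a-1))(p + 1) - p * Psi(p^(a-2)) = Psi(p^a).
print(psi(4) * 3 - 2 * psi(2) == psi(8))
```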


Now let $p$ be a prime. To compute $\Phi_p(X,Y)$, we will follow Kaltofen
and Yui [66] and Yui [110]. First note by parts (iii) and (v) of Theorem
11.18, we have

\[ \begin{aligned}
\Phi_p(X,Y) &= \Phi_p(Y,X) \\
\Phi_p(X,Y) &\equiv (X^p - Y)(X - Y^p) \bmod p\mathbb{Z}[X,Y],
\end{aligned} \]

and we also know that $\Phi_p(X,Y)$ is monic of degree $\Psi(p) = p + 1$ as a
polynomial in $X$. Thus we can write $\Phi_p(X,Y)$ in the following form:

\[ (X^p - Y)(X - Y^p) + p\sum_{0 \le i \le p} c_{ii}X^iY^i
   + p\sum_{0 \le i < j \le p} c_{ij}(X^iY^j + X^jY^i), \tag{13.15} \]

where the coefficients $c_{ij}$ are integers. We will use the $q$-expansion
of the $j$-function to obtain a finite system of equations that can be solved
uniquely for the $c_{ij}$'s.
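
To make the shape of (13.15) concrete, the sketch below extracts the $c_{ij}$ for $p = 2$. It relies on an assumption that should be stated plainly: the coefficients of $\Phi_2(X,Y)$ used here are the classical values, supplied from outside this text. The dictionary layout and the function name `c_coefficients` are illustrative.

```python
# Phi_2(X, Y) stored as {(i, j): coefficient of X^i Y^j}.
# Assumption: these are the classical coefficients of Phi_2; they are not printed in the text.
PHI_2 = {
    (3, 0): 1, (0, 3): 1, (2, 2): -1,
    (2, 1): 1488, (1, 2): 1488,
    (2, 0): -162000, (0, 2): -162000,
    (1, 1): 40773375,
    (1, 0): 8748000000, (0, 1): 8748000000,
    (0, 0): -157464000000000,
}

def c_coefficients(phi, p):
    """Return the integers c_ij of (13.15), keyed by (i, j) with i <= j."""
    # (X^p - Y)(X - Y^p) = X^(p+1) - X^p Y^p - X Y + Y^(p+1)
    leading = {(p + 1, 0): 1, (p, p): -1, (1, 1): -1, (0, p + 1): 1}
    c = {}
    for key in set(phi) | set(leading):
        i, j = key
        if phi.get((i, j), 0) != phi.get((j, i), 0):
            raise ValueError("polynomial is not symmetric")
        diff = phi.get(key, 0) - leading.get(key, 0)
        if diff % p:
            raise ValueError("congruence modulo p fails")
        if diff:
            c[(min(i, j), max(i, j))] = diff // p
    return c

C = c_coefficients(PHI_2, 2)
print(sorted(C.items()))
```

The function raises an error if either the symmetry or the congruence modulo $p$ fails, so a successful run confirms both properties for the stored polynomial.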
By the definition of the modular equation, we have the identity

\[ \Phi_p(j(p\tau),j(\tau)) = 0. \]

Substituting the $q$-expansions for $j(\tau)$ and $j(p\tau)$ into this
equation and using (13.15), we obtain

\[ \begin{aligned}
0 = {} & (j(p\tau)^p - j(\tau))(j(p\tau) - j(\tau)^p)
  + p\sum_{0 \le i \le p} c_{ii}\,j(p\tau)^i j(\tau)^i \\
& + p\sum_{0 \le i < j \le p} c_{ij}\bigl(j(p\tau)^i j(\tau)^j + j(p\tau)^j j(\tau)^i\bigr).
\end{aligned} \tag{13.16} \]

If we equate the coefficients of the different powers of $q$, we get an
infinite number of linear equations in the variables $c_{ij}$. We can reduce
to a finite number of equations as follows:

Proposition 13.17. The finite system of linear equations obtained by equating
coefficients of nonpositive powers of $q$ in (13.16) has a unique solution
given by the coefficients $c_{ij}$ of the modular equation.


Proof. Since the modular equation provides one solution, it suffices to
prove uniqueness. Using (13.15), a solution of these equations gives a
polynomial $F(X,Y)$ with the following three properties:

(i) $F(X,Y)$ is monic of degree $p + 1$ in $X$.

(ii) $F(X,Y) = F(Y,X)$.

(iii) $\lim_{\mathrm{Im}\,\tau \to \infty} F(j(p\tau),j(\tau)) = 0$.

To explain the last property, note that the $q$-expansion of
$F(j(p\tau),j(\tau))$ contains no nonpositive powers of $q$ since $F(X,Y)$
comes from a solution of our finite system of equations. Since $q \to 0$ as
$\mathrm{Im}\,\tau \to \infty$, (iii) follows.

We claim these properties force $F(X,Y) = \Phi_p(X,Y)$, which will prove
uniqueness. The idea is to study $F(j(p\tau),j(\tau))$, which is a modular
function for $\Gamma_0(p)$. We will first show that $F(j(p\tau),j(\tau))$
vanishes at the cusps, which means that

\[ \lim_{\mathrm{Im}\,\tau \to \infty} F(j(p\gamma\tau),j(\gamma\tau)) = 0
   \quad \text{for all } \gamma \in SL(2,\mathbb{Z}). \tag{13.18} \]

Using (11.12), this is equivalent to showing

\[ \lim_{\mathrm{Im}\,\tau \to \infty} F(j(\sigma\tau),j(\tau)) = 0
   \quad \text{for all } \sigma \in C(p). \]

When $\sigma = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}$, we're done by
(iii), and when $\sigma \ne \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}$,
$\sigma$ must be of the form $\begin{pmatrix} 1 & i \\ 0 & p \end{pmatrix}$
for some integer $i$, since $p$ is prime. If we set
$u = \sigma\tau = (\tau + i)/p$, then $\tau = pu - i$, and

\[ \begin{aligned}
\lim_{\mathrm{Im}\,\tau \to \infty} F(j(\sigma\tau),j(\tau))
 &= \lim_{\mathrm{Im}\,u \to \infty} F(j(u),j(pu - i)) \\
 &= \lim_{\mathrm{Im}\,u \to \infty} F(j(u),j(pu)) \\
 &= \lim_{\mathrm{Im}\,u \to \infty} F(j(pu),j(u)) = 0,
\end{aligned} \]

where we used (ii) and (iii) above. This proves (13.18).

Thus $F(j(p\tau),j(\tau))$ is a holomorphic modular function for
$\Gamma_0(p)$ which vanishes at the cusps. In the case of modular functions
for $SL(2,\mathbb{Z})$, we proved in Lemma 11.10 that such a function is
zero, and the proof extends easily to the case of $\Gamma_0(p)$ (see
Exercise 13.11). This shows that $F(j(p\tau),j(\tau)) = 0$, so that
$j(p\tau)$ is a root of $F(X,j(\tau))$ and $\Phi_p(X,j(\tau))$. Since the
latter is irreducible over $\mathbb{C}(j(\tau))$, it must divide
$F(X,j(\tau))$. Both $F(X,Y)$ and $\Phi_p(X,Y)$ are monic of the same degree,
and hence they must be equal. Q.E.D.


Looking at the $q$-expansions for $j(\tau)$ and $j(p\tau)$, the most negative
power of $q$ in (13.16) is $q^{-p^2-p}$, and it follows that the system of
equations described in Proposition 13.17 has $p^2 + p + 1$ equations in the
$(p^2 + 3p + 2)/2$ unknowns $c_{ij}$. With some cleverness, one can reduce to
$p^2 + p$ equations in $(p^2 + 3p)/2$ unknowns (see Yui [110]). These
equations have been written down explicitly by Yui [110], though the
resulting expressions are extremely complicated. For a discussion of the
computational aspects of these equations, see Kaltofen and Yui [66].

We are not quite done, for our equations for $\Phi_p(X,Y)$ involve the
$q$-expansions of $j(\tau)$ and $j(p\tau)$. Hence we need to calculate those
coefficients of the $q$-expansions which contribute to negative powers of $q$
in (13.16). It suffices to do this for $j(\tau)$, and because the most
negative power of $q$ in (13.16) is $q^{-p^2-p}$, we need only the first
$p^2 + p$ coefficients of the $q$-expansion of the $j$-function. In §12 we
found some nice formulas for

\[ j(\tau) = 1728\,\frac{g_2(\tau)^3}{\Delta(\tau)}, \]

but to get the $q$-expansion, we need series expansions of the numerator and
denominator. For $g_2(\tau)$, we use the classical formula

\[ g_2(\tau) = \frac{(2\pi)^4}{12}\Bigl(1 + 240\sum_{n=1}^{\infty} \sigma_3(n)\,q^n\Bigr), \]

where $\sigma_3(n) = \sum_{d \mid n} d^3$ (see Lang [73, §4.1] or Serre
[88, §VII.4.2]), and for $\Delta(\tau)$, we know from Theorem 12.17 that

\[ \Delta(\tau) = (2\pi)^{12}\,q\prod_{n=1}^{\infty} (1 - q^n)^{24}. \]


This is still not a series, but if we use Euler's famous identity

\[ \prod_{n=1}^{\infty} (1 - q^n) = \sum_{n=-\infty}^{\infty} (-1)^n q^{n(3n+1)/2} \]

(see Hardy and Wright [48, §19.9]), then it becomes straightforward to write
a program to compute the $q$-expansion of $j(\tau)$. A description of how to
do this is in Hermann [53] (he also gives an alternate approach to
calculating the modular equation), and one finds that the first few terms of
the $q$-expansion are

\[ \begin{aligned}
j(\tau) = {} & \frac{1}{q} + 744 + 196884q + 21493760q^2 + 864299970q^3 \\
& + 20245856256q^4 + 333202640600q^5 + \cdots.
\end{aligned} \]

These formulas also give a second proof that the $q$-expansion of $j(\tau)$
has integer coefficients (see Exercise 13.12).
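
The program alluded to above can be sketched in a few lines of Python. This is not Hermann's description, only a minimal illustration under the following assumptions: the powers of $2\pi$ cancel, so that $j(\tau) = E_4^3 / \bigl(q\prod(1-q^n)^{24}\bigr)$ with $E_4 = 1 + 240\sum\sigma_3(n)q^n$; power series are truncated lists of integers; and the helper names are invented for this example.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, N):
    """Product of two power series, truncated to N terms."""
    c = [0] * N
    for i, x in enumerate(a):
        for k, y in enumerate(b[:N - i]):
            c[i + k] += x * y
    return c

def inverse(a, N):
    """Inverse of a power series with constant term 1, truncated to N terms."""
    inv = [1] + [0] * (N - 1)
    for n in range(1, N):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv

def j_coefficients(N):
    """First N coefficients of j, starting with the coefficient of 1/q."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, N)]
    # Euler's identity: prod (1 - q^n) = sum over all integers k of (-1)^k q^(k(3k+1)/2)
    euler = [0] * N
    for k in range(-N, N + 1):
        e = k * (3 * k + 1) // 2
        if e < N:
            euler[e] += -1 if k % 2 else 1
    delta = [1] + [0] * (N - 1)
    for _ in range(24):
        delta = mul(delta, euler, N)
    e4_cubed = mul(mul(e4, e4, N), e4, N)
    return mul(e4_cubed, inverse(delta, N), N)

print(j_coefficients(7))
```

Since every step uses integer arithmetic and the denominator series has constant term 1, the output also illustrates why the coefficients are integers.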

The conclusion of this rather long discussion is that for any integer
$m > 0$, we can compute $\Phi_m(X,Y)$, which then gives us $\Phi_m(X,X)$ by
setting $X = Y$. There are known algorithms for factoring $\Phi_m(X,X)$ into
irreducibles, and then the discussion following Proposition 13.11 shows how
to compute $H_{\mathcal{O}}(X)$. We have thus proved the following theorem:

Theorem 13.19. Given an order $\mathcal{O}$ in an imaginary quadratic field,
there is an algorithm for computing the class equation
$H_{\mathcal{O}}(X)$. Q.E.D.


The problem with this theorem is that our algorithm for computing
$H_{\mathcal{O}}(X)$ requires knowing $\Phi_m(X,Y)$. Modular equations are
extremely complicated polynomials and are difficult to compute. We saw in
(11.22) that $\Phi_3(X,Y)$ is very large, and things get worse as $m$
increases. For example, the printout of $\Phi_{11}(X,Y)$ takes over two
single-spaced pages, and some of the coefficients have over 120 digits (see
Kaltofen and Yui [66]). In general, Cohen [18] proved that the maximum of the
absolute values of the coefficients of $\Phi_m(X,Y)$ is asymptotic to
$\exp(6\Psi(m)\log(m))$, where $\Psi(m) = m\prod_{p \mid m}(1 + 1/p)$, so
that the growth is exponential in $m$. Hence the above algorithm is not a
practical way to compute class equations.

Recently, a more efficient approach to computing $H_D(X)$ has been developed
by Kaltofen and Yui [65]. The basic idea is to compute $H_D(X)$ directly from
the formula

\[ H_D(X) = \prod_{i=1}^{h} (X - j(\mathfrak{a}_i)). \]

We know how to find the $h = h(D)$ reduced forms of discriminant $D$, and
then the $\mathfrak{a}_i$'s can be taken to be the proper
$\mathcal{O}$-ideals corresponding to the reduced forms via Theorem 7.7.
Since $H_D(X)$ has integral coefficients, we need only compute
$j(\mathfrak{a}_i)$ numerically to a sufficiently high degree of precision,
and the formulas for $j(\tau)$ given in §12 are ideal for this purpose. For
an example of how this works, consider the case of discriminant $D = -71$.
Here, the class number is $h(-71) = 7$, and the above process shows that the
minimal polynomial of $j((1 + \sqrt{-71})/2)$ is

\[ \begin{aligned}
H_{-71}(X) = {} & X^7 + 5\cdot 7\cdot 31\cdot 127\cdot 233\cdot 9769\,X^6 \\
& - 2\cdot 5\cdot 7\cdot 44171287694351\,X^5 \\
& + 2\cdot 3\cdot 7\cdot 2342715209763043144031\,X^4 \\
& - 3\cdot 7\cdot 31\cdot 1265029590537220860166039\,X^3 \\
& + 2\cdot 7\cdot 11^3\cdot 67\cdot 229\cdot 17974026192471785192633\,X^2 \\
& - 7\cdot 11^6\cdot 17^6\cdot 1420913330979618293\,X \\
& + (11^3\cdot 17^2\cdot 23\cdot 41\cdot 47\cdot 53)^3.
\end{aligned} \tag{13.20} \]

(This example was taken from the preliminary version of [65]; all primes
$< 1000$ were factored out of the coefficients.) Note that the constant term
is a cube, as predicted by Theorem 12.2.
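
The numerical approach just described can be tried out for small discriminants. The Python sketch below is not the implementation of Kaltofen and Yui; it is a minimal illustration that assumes double-precision arithmetic (adequate only when the coefficients are far smaller than those of $H_{-71}(X)$), the point $\tau = (-b + \sqrt{D})/2a$ attached to a reduced form $ax^2 + bxy + cy^2$, and the series $j = E_4^3/\bigl(q\prod(1-q^n)^{24}\bigr)$. All function names are illustrative.

```python
import cmath
from math import gcd

def reduced_forms(D):
    """Primitive reduced forms (a, b, c) of discriminant D < 0."""
    forms, a = [], 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) == 0:
                c = (b * b - D) // (4 * a)
                if c >= a and gcd(gcd(a, b), c) == 1 and not (b < 0 and a == c):
                    forms.append((a, b, c))
        a += 1
    return forms

def j_invariant(tau, terms=60):
    """Numerical value of j(tau) from the q-series of E_4 and the product for Delta."""
    q = cmath.exp(2j * cmath.pi * tau)
    e4, eta24, qn = 1, q, 1
    for n in range(1, terms):
        qn *= q
        e4 += 240 * n ** 3 * qn / (1 - qn)
        eta24 *= (1 - qn) ** 24
    return e4 ** 3 / eta24

def class_polynomial(D):
    """Coefficients of H_D(X), highest degree first, rounded to integers."""
    coeffs = [1]
    for a, b, c in reduced_forms(D):
        root = j_invariant((-b + cmath.sqrt(D)) / (2 * a))
        # multiply the current polynomial by (X - root)
        coeffs = [x - root * y for x, y in zip(coeffs + [0], [0] + coeffs)]
    return [round(complex(z).real) for z in coeffs]

print(class_polynomial(-12))   # recovers X - 54000
print(class_polynomial(-15))
print(len(reduced_forms(-71)))
```

For $D = -71$ the seven reduced forms are found at once, but recovering the coefficients in (13.20) would require far more digits than floating point provides, which is why high-precision arithmetic is needed in practice.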

We can apply the algorithm of Theorem 13.19 to give a constructive ver- 
sion of Theorem 9.2, but before we do this, we need to learn about some 
of the recent work of Deuring, Gross and Zagier on the class equation. 


C. Theorems of Deuring, Gross and Zagier 


In 1946 Deuring [25] proved a remarkable result concerning prime divisors of
the difference of two singular moduli. To state Deuring's theorem precisely,
let $\mathcal{O}_1$ and $\mathcal{O}_2$ be orders in imaginary quadratic
fields $K_1$ and $K_2$ respectively, and for $i = 1, 2$, let
$\mathfrak{a}_i$ be a proper fractional $\mathcal{O}_i$-ideal. Then we have:

Theorem 13.21. Let $L$ be a number field containing $j(\mathfrak{a}_1)$ and
$j(\mathfrak{a}_2)$, and let $\mathfrak{P}$ be a prime of $L$ lying over the
prime number $p$. When $K_1 = K_2$, assume in addition that $p$ divides
neither of the conductors of $\mathcal{O}_1$ and $\mathcal{O}_2$. If
$j(\mathfrak{a}_1) \ne j(\mathfrak{a}_2)$, then

\[ j(\mathfrak{a}_1) \equiv j(\mathfrak{a}_2) \bmod \mathfrak{P}
   \implies p \text{ splits completely in neither } K_1 \text{ nor } K_2. \]

Proof. The proof uses reduction theory of elliptic curves. See Deuring [25]
or Lang [73, §13.4]. Q.E.D.


We can use this theorem to study the constant term and discriminant of 
the class equation: 


Corollary 13.22. Let $D < 0$ be a discriminant, and let $p$ be prime.

(i) If $p$ divides the constant term of $H_D(X)$ and
$\mathbb{Q}(\sqrt{D}) \ne \mathbb{Q}(\sqrt{-3})$, then $(D/p) \ne 1$ and
either $p = 3$ or $p \equiv 2 \bmod 3$.

(ii) If $p$ divides the discriminant of $H_D(X)$, then $(D/p) \ne 1$.

Proof. Let $\mathfrak{a}_1, \ldots, \mathfrak{a}_h$, $h = h(D)$, be ideal
class representatives for the order of discriminant $D$. To prove (i), note
that the constant term of the class equation is

\[ C = \pm\prod_{i=1}^{h} j(\mathfrak{a}_i). \]

If $p \mid C$, then in some number field $L$, there is a prime $\mathfrak{P}$
containing $p$ that divides some $j(\mathfrak{a}_i)$. Since

\[ j(\mathfrak{a}_i) = j(\mathfrak{a}_i) - 0 = j(\mathfrak{a}_i) - j((1 + \sqrt{-3})/2), \]

we know by Theorem 13.21 that $p$ splits in neither $\mathbb{Q}(\sqrt{D})$
nor $\mathbb{Q}(\sqrt{-3})$, and (i) follows immediately.

To prove (ii), note that the discriminant of $H_D(X)$ is

\[ \mathrm{disc}(H_D(X)) = \prod_{i<j} (j(\mathfrak{a}_i) - j(\mathfrak{a}_j))^2. \]

Thus, if $p \mid \mathrm{disc}(H_D(X))$, then some $\mathfrak{P}$ lying over
$p$ divides some $j(\mathfrak{a}_i) - j(\mathfrak{a}_j)$. If $p \nmid D$,
then Theorem 13.21 implies that $p$ doesn't split in $\mathbb{Q}(\sqrt{D})$,
and $(D/p) = -1$ follows. If $p \mid D$, then $(D/p) = 0$, so that
$(D/p) \ne 1$ in either case. Q.E.D.


One of our original motivations for studying complex multiplication came
from the question of when a prime can be written in the form $x^2 + ny^2$.
Using the class equation, we can now prove a constructive version of our
basic result, Theorem 9.2:

Theorem 13.23. Let $n$ be a positive integer. Then there is a monic
irreducible polynomial $f_n(X)$ of degree $h(-4n)$ such that for an odd prime
$p$ not dividing $n$,

\[ p = x^2 + ny^2 \iff
   \begin{cases} (-n/p) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \\
   \text{has an integer solution.} \end{cases} \]

Furthermore, there is an algorithm for finding $f_n(X)$.

Proof. The order of discriminant $-4n$ is $\mathcal{O} = [1,\sqrt{-n}]$, so
that by Theorem 11.1, $j(\sqrt{-n})$ is a real algebraic integer and is a
primitive element of the ring class field of $\mathcal{O}$. Since
$H_{-4n}(X)$ is the minimal polynomial of $j(\sqrt{-n})$, we can set
$f_n(X) = H_{-4n}(X)$ in Theorem 9.2, and then the desired equivalence holds
for primes dividing neither $-4n$ nor the discriminant of $H_{-4n}(X)$. But
when a prime divides the discriminant, Corollary 13.22 tells us that
$(-4n/p) \ne 1$. Since both sides of the desired equivalence imply
$(-n/p) = 1$, the discriminant condition is superfluous. Finally, by Theorem
13.19, there is an algorithm for finding $H_{-4n}(X)$, and the theorem is
proved. Q.E.D.


From a computational point of view, this result is not ideal. The
polynomials $H_{-4n}(X)$ are difficult to compute, and as indicated by
$H_{-56}(X)$ and $H_{-71}(X)$ (see (13.1) and (13.20)), they are excessively
complicated. The real value of Theorem 13.23 is the way it links the ideas of
class field theory and complex multiplication to the elementary question of
when a prime can be written in the form $x^2 + ny^2$.
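
Theorem 13.23 can be tested numerically in a small case. The sketch below takes $n = 5$ and assumes the value $H_{-20}(X) = X^2 - 1264000X - 681472000$, a classical class equation that is not computed in this section; the helper names are illustrative. It compares the criterion of the theorem with a direct search for representations $p = x^2 + 5y^2$.

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def represented(p, n):
    """True if p = x^2 + n*y^2 has an integer solution."""
    y = 0
    while n * y * y <= p:
        x2 = p - n * y * y
        if isqrt(x2) ** 2 == x2:
            return True
        y += 1
    return False

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    s = pow(a % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def criterion(p):
    """Right-hand side of Theorem 13.23 for n = 5 (assumed H_{-20} as stated above)."""
    has_root = any((x * x - 1264000 * x - 681472000) % p == 0 for x in range(p))
    return legendre(-5, p) == 1 and has_root

primes = [p for p in range(3, 300) if is_prime(p) and p != 5]
print(all(represented(p, 5) == criterion(p) for p in primes))
print([p for p in primes if criterion(p)][:6])
```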

Deuring's study of $j(\mathfrak{a}_1) - j(\mathfrak{a}_2)$ has prompted some
recent work of Gross and Zagier [46] which determines exactly which primes
divide such a difference. Their results apply only to field discriminants,
but one gets very complete information in this case. Let $d_1$ and $d_2$ be
the discriminants of imaginary quadratic fields $K_1$ and $K_2$ respectively.
We will assume that $d_1$ and $d_2$ are relatively prime. Then set

\[ J(d_1,d_2) = \Bigl(\prod_{i=1}^{h_1}\prod_{j=1}^{h_2}
   (j(\mathfrak{a}_i) - j(\mathfrak{b}_j))\Bigr)^{4/(w_1 w_2)}, \]

where $\mathfrak{a}_1, \ldots, \mathfrak{a}_{h_1}$ are ideal class
representatives of $\mathcal{O}_{K_1}$,
$\mathfrak{b}_1, \ldots, \mathfrak{b}_{h_2}$ are ideal class representatives
of $\mathcal{O}_{K_2}$, and $w_1 = |\mathcal{O}_{K_1}^*|$,
$w_2 = |\mathcal{O}_{K_2}^*|$. Note that $J(d_1,d_2)$ is an integer when
$d_1, d_2 < -4$, and that $J(d_1,d_2)^2$ is always an integer (see Exercise
13.13).

To state Gross and Zagier's formula for $J(d_1,d_2)^2$, we will need
functions $\epsilon(n)$ and $F(m)$, which are defined as follows. First, if
$p$ is a prime, we set

\[ \epsilon(p) = \begin{cases} (d_1/p) & \text{if } p \nmid d_1 \\
   (d_2/p) & \text{if } p \nmid d_2. \end{cases} \]

The reader can check that this is well-defined whenever
$(d_1d_2/p) \ne -1$ (see Exercise 13.14). Then, if
$n = \prod_{i=1}^{r} p_i^{a_i}$, we set

\[ \epsilon(n) = \prod_{i=1}^{r} \epsilon(p_i)^{a_i}, \]

where we assume that $(d_1d_2/p_i) \ne -1$ for all $i$. Finally, $F(m)$ is
defined by the formula

\[ F(m) = \prod_{\substack{nn' = m \\ n, n' > 0}} n^{\epsilon(n')}. \]

This is well-defined when all primes $p$ dividing $m$ satisfy
$(d_1d_2/p) \ne -1$. We can now state the main theorem of Gross and Zagier
[46]:

Theorem 13.24. With the above notation,

\[ J(d_1,d_2)^2 = \pm\prod_{\substack{x^2 < d_1d_2 \\ x^2 \equiv d_1d_2 \bmod 4}}
   F\Bigl(\frac{d_1d_2 - x^2}{4}\Bigr). \]

Proof. First note that $F((d_1d_2 - x^2)/4)$ is always defined since any
prime $p$ dividing $(d_1d_2 - x^2)/4$ satisfies $(d_1d_2/p) \ne -1$ (see
Exercise 13.14). The paper [46] contains two proofs of this theorem, one
algebraic and one analytic. The algebraic proof, which uses reduction theory
of elliptic curves, is given only for the case of prime discriminants. A
general version of this proof appears in Dorman [30]. Q.E.D.
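
The right-hand side of Theorem 13.24 is completely elementary, so it can be evaluated directly. The Python sketch below is an illustration under stated assumptions: $d_1$ and $d_2$ are relatively prime negative field discriminants, the symbol $(d/p)$ is computed as a Kronecker symbol at primes only, and the names `epsilon`, `F` and `J_squared` simply mirror the notation of the text. The sign $\pm$ is ignored.

```python
from fractions import Fraction
from math import isqrt

def kronecker_prime(D, p):
    """Kronecker symbol (D/p) for a prime p."""
    if p == 2:
        if D % 2 == 0:
            return 0
        return 1 if D % 8 in (1, 7) else -1
    s = pow(D % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def factor(n):
    """Prime factorization of n as {prime: exponent}, by trial division."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def epsilon(n, d1, d2):
    value = 1
    for p, a in factor(n).items():
        sym = kronecker_prime(d1, p) if d1 % p else kronecker_prime(d2, p)
        value *= sym ** a
    return value

def F(m, d1, d2):
    result = Fraction(1)
    for n in range(1, m + 1):
        if m % n == 0:
            result *= Fraction(n) ** epsilon(m // n, d1, d2)
    return result

def J_squared(d1, d2):
    """Absolute value of J(d1, d2)^2 according to Theorem 13.24."""
    D = d1 * d2
    result = Fraction(1)
    for x in range(-isqrt(D), isqrt(D) + 1):
        if x * x < D and (D - x * x) % 4 == 0:
            result *= F((D - x * x) // 4, d1, d2)
    return result

print(J_squared(-3, -4))    # j(i)^(1/3) = 1728^(1/3)
print(J_squared(-4, -7))    # 1728 - (-3375)
```

The two printed values can be compared with the definition of $J(d_1,d_2)$: for $(d_1,d_2) = (-3,-4)$ one has $J^2 = 1728^{1/3} = 12$, and for $(-4,-7)$ one has $J^2 = 1728 + 3375 = 5103$, using $j((1+\sqrt{-7})/2) = -3375$.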


This theorem gives the following corollary: 


Corollary 13.25. Let $p$ be a prime dividing $J(d_1,d_2)^2$. Then:

(i) $(d_1/p) \ne 1$ and $(d_2/p) \ne 1$.

(ii) $p$ divides a positive integer of the form $(d_1d_2 - x^2)/4$.

(iii) $p \le d_1d_2/4$.

Proof. If $p$ divides $J(d_1,d_2)^2$, it must divide some
$F((d_1d_2 - x^2)/4)$, and the formula for $F(m)$ then shows that $p$ divides
$(d_1d_2 - x^2)/4$. This easily implies parts (ii) and (iii) of the
corollary.

It remains to prove part (i). We will first consider the following lemma
which tells us how to compute $F(m)$:

Lemma 13.26. Let $m$ be a positive integer of the form $(d_1d_2 - x^2)/4$.
Then $F(m) = 1$ unless $m$ can be written in the form

\[ m = p^{2a+1}\,p_1^{2a_1} \cdots p_r^{2a_r}\,q_1^{b_1} \cdots q_s^{b_s}, \]

where $\epsilon(p) = \epsilon(p_1) = \cdots = \epsilon(p_r) = -1$ and
$\epsilon(q_1) = \cdots = \epsilon(q_s) = 1$. In this case,

\[ F(m) = p^{(a+1)(b_1+1) \cdots (b_s+1)}. \]

In particular, $p \mid F(m)$ means that $p$ is the only prime dividing $m$
with an odd exponent and $\epsilon(p) = -1$.

Proof. See Exercises 13.15 and 13.16. Q.E.D.

We can now complete the proof of Corollary 13.25. The above lemma shows that
$\epsilon(p) = -1$ for any prime $p$ dividing $F(m)$. It is easy to see that
$\epsilon(p) = -1$ implies $(d_1/p) \ne 1$ and $(d_2/p) \ne 1$ (see Exercise
13.14), and the corollary is proved. Q.E.D.

Note that this corollary implies Deuring's theorem in the case of relatively
prime field discriminants. We should also mention that when
$d_1d_2 \equiv 1 \bmod 8$, one gets better upper bounds on $p$ (see Exercise
13.17).

If we apply Corollary 13.25 when $d_2 = -3$, then we can strengthen
Deuring's result about the constant term of the class equation:

Corollary 13.27. Let $d_K$ be the discriminant of an imaginary quadratic
field $K$, and assume that $3 \nmid d_K$. If $p$ is a prime dividing the
constant term of $H_{d_K}(X)$, then $(d_K/p) \ne 1$ and either $p = 3$ or
$p \equiv 2 \bmod 3$. Furthermore, $p \le 3|d_K|/4$.

Proof. If $\mathfrak{a}_1, \ldots, \mathfrak{a}_h$ are ideal class
representatives of $\mathcal{O}_K$, then

\[ J(d_K,-3) = \Bigl(\prod_{i=1}^{h} j(\mathfrak{a}_i)\Bigr)^{4/(6w)}, \]

where $w = |\mathcal{O}_K^*|$. Thus the primes dividing $J(d_K,-3)$ are the
same as the primes dividing the constant term of $H_{d_K}(X)$, and we are
done by the previous corollary. Q.E.D.


For an example of how good these estimates are, consider $H_{-56}(X)$. We
know from (13.1) that the constant term is

\[ (2^8 \cdot 11^2 \cdot 17 \cdot 41)^3. \]

Corollary 13.27 gives us the estimate $p \le 3|-56|/4 = 42$, which is as good
as one can get. The reader should also check the constant term of
$H_{-71}(X)$ given in (13.20); the estimate is again as good as possible. Of
course, one could use Theorem 13.24 to compute these constant terms directly
(see Exercise 13.18).

Gross and Zagier also have similar theorems for primes dividing the
discriminant of the class equation. Rather than give the formula for the
multiplicities of the primes, we will just state the following corollary of
their result:

Theorem 13.28. Let $d_K$ be the discriminant of an imaginary quadratic field
$K$, and let $p$ be a prime dividing the discriminant of $H_{d_K}(X)$. Then
$(d_K/p) \ne 1$ and $p \le |d_K|$.

Proof. In the case of prime discriminants, this is proved by Gross and
Zagier in [46], and the general case is in Dorman [29]. Q.E.D.

This theorem strengthens Deuring's result about the discriminant of the
class equation. For an example of the bound $p \le |d_K|$, consider
$H_{-56}(X)$. One computes that its discriminant is


—2N6 , 713, 4410. 47°. 294 . 342 . 37% . 412 . 432. 47? . 532. 


Theorem 13.28 gives the bound $p \le 56$ on the primes that can appear,
which again is the best possible.


D. Exercises 


13.1. Use Corollary 11.37 to prove Proposition 13.2.

13.2. If $\mathcal{O}$ is an order in an imaginary quadratic field and $m$
is a positive integer, then we define
$r(\mathcal{O},m) = |\{\alpha \in \mathcal{O} : \alpha \text{ is primitive and } N(\alpha) = m\}/\mathcal{O}^*|$,
where $\mathcal{O}^*$ acts by multiplication.
(a) Prove that $r(\mathcal{O},m)$ is finite.
(b) For fixed $m$, prove that there are only finitely many orders
$\mathcal{O}$ such that $r(\mathcal{O},m) > 0$.

13.3. Let $F(X,Y) \in \mathbb{C}[X,Y]$, and suppose that $F(X_0,X_0) = 0$.
Then $X_0$ is a root of both $F(X,X_0)$ and $F(X,X)$.
(a) If $F(X,Y) = X^3 + Y^3 + 4XY$, then show that 0 is a root of $F(X,0)$
and $F(X,X)$ of different multiplicities. Note that the polynomial $F(X,Y)$
is symmetric.
(b) If $F(X,Y)$ and $X_0$ satisfy the additional condition that

\[ \lim_{X \to X_0} \frac{F(X,X)}{F(X,X_0)} \]

exists and is nonzero, then show that $X_0$ is a root of $F(X,X_0)$ and
$F(X,X)$ of the same multiplicity.

13.4. This exercise is concerned with the proof of (13.7). Recall that
$\tilde\sigma(\tau_0) = \tau_0$, where
$\tilde\sigma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has relatively
prime entries and determinant $m > 1$.
(a) Prove that $c \ne 0$.
(b) When $j(\tau_0) = 1728$, we can assume $\tau_0 = i$. Show that
$m^2 = (ci + d)^4$ implies $c = \pm\sqrt{m}$ and $d = 0$. Since
$\tilde\sigma(i) = i$, conclude that $a = 0$ and $b = \mp\sqrt{m}$, and
derive a contradiction.
(c) When $j(\tau_0) = 0$, argue as in (b) to complete the proof of (13.7).

13.5. Let $m > 1$, and let $\mathcal{O} = [1,\tau_0]$ be an order in an
imaginary quadratic field. Consider the sets

\[ \begin{aligned}
A &= \{\alpha \in \mathcal{O} : \alpha \text{ is primitive and } N(\alpha) = m\}/\mathcal{O}^* \\
B &= \{\sigma \in C(m) : j(\sigma\tau_0) = j(\tau_0)\}.
\end{aligned} \]

In the proof of Theorem 13.4, we showed how an element $[\alpha] \in A$
determines a unique $\sigma \in B$. Prove that the map
$[\alpha] \mapsto \sigma$ defines a bijection $A \to B$.

13.6. The goal of this exercise is to prove the formula for the degree $N$
of $\Phi_m(X,X)$ given in Proposition 13.8.
(a) Prove that $q^{-N}$ is the most negative power of $q$ in the
$q$-expansion of $\Phi_m(j(\tau),j(\tau))$.
(b) If $\sigma = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in C(m)$, then
use (11.19) to show that the $q$-expansion of $j(\tau) - j(\sigma\tau)$ is

\[ \begin{cases}
q^{-1} - \zeta_m^{-ab}\,q^{-a/d} + \cdots & \text{when } a < d \\
-\zeta_m^{-ab}\,q^{-a/d} + q^{-1} + \cdots & \text{when } a > d \\
(1 - \zeta_m^{-ab})\,q^{-1} + \cdots & \text{when } a = d,
\end{cases} \]

where $\zeta_m = e^{2\pi i/m}$. The last possibility can occur only when $m$
is a perfect square, and in this case, $\zeta_m^{-ab} \ne 1$ since
$\sigma \in C(m)$.
(c) Given $a$, we know that $d = m/a$. In part (a) of Exercise 11.9 we showed
that the number of possible $\sigma \in C(m)$ with this $a$ and $d$ was

\[ \frac{d}{e}\,\phi(e), \]

where $e = \gcd(a,d)$. Use this formula and (b) to show that the degree $N$
of $\Phi_m(X,X)$ equals

\[ \sum_{\substack{a \mid m \\ a < \sqrt{m}}} \frac{d}{e}\,\phi(e)
   + \sum_{\substack{a \mid m \\ a > \sqrt{m}}} \frac{a}{e}\,\phi(e)
   + \phi(\sqrt{m}). \]

(d) Show that the first two sums in the above expression are equal. This
proves the formula for $N$ given in Proposition 13.8.

13.7. This exercise is concerned with some examples of Theorem 13.4.
(a) Verify that $r(-3,3) = r(-12,3) = 1$, $r(-8,3) = r(-11,3) = 2$, and also
show that $r(D,3) = 0$ for all other discriminants. This proves that

\[ \Phi_3(X,X) = \pm H_{-3}(X)\,H_{-12}(X)\,H_{-8}(X)^2\,H_{-11}(X)^2. \]

(b) Use the method of (a) to write down the factorization of
$\Phi_5(X,X)$.

13.8. The proof of Proposition 13.11 requires the following facts about the
orders of discriminant $-3$ and $-4$ ($\mathbb{Z}[\omega]$ and
$\mathbb{Z}[i]$ respectively).
(a) If $m > 1$, show that $r(-3,m) = 1$ if and only if $m = 3$.
(b) If $m > 1$, show that $r(-4,m) = 1$ if and only if $m = 2$.


13.9. Let $m \equiv 3 \bmod 4$ be an integer $> 3$. Show that the order
$\mathbb{Z}[\sqrt{-m}]$ of discriminant $-4m$ has no primitive elements of
norm $(m+1)/4$.


13.10. In this exercise we will illustrate the algorithm given in the text
for computing $H_D(X)$.
(a) Show that $H_{-56}(X)$ is determined by knowing $\Phi_{14}(X,X)$.
(b) Show that $H_{-11}(X)$ and $H_{-44}(X)$ are determined by knowing
$\Phi_{11}(X,X)$.
(c) Show that $H_{-7}(X)$ and $H_{-28}(X)$ are determined by knowing
$\Phi_7(X,X)$ and $\Phi_2(X,X)$.

13.11. Let $f(\tau)$ be a modular function for $\Gamma_0(m)$ which vanishes
at the cusps.
(a) If $\gamma_i$, $i = 1, \ldots, |C(m)|$ are coset representatives for
$\Gamma_0(m) \subset SL(2,\mathbb{Z})$, then show that

\[ \prod_{i=1}^{|C(m)|} f(\gamma_i\tau) \]

is a modular function for $SL(2,\mathbb{Z})$ which vanishes at infinity.
(b) If in addition $f(\tau)$ is holomorphic on $\mathfrak{h}$, then show
that $f(\tau)$ is identically zero. Hint: use (a) and Lemma 11.10.


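13.12. Use the formulas

\[ \begin{aligned}
g_2(\tau) &= \frac{(2\pi)^4}{12}\Bigl(1 + 240\sum_{n=1}^{\infty} \sigma_3(n)\,q^n\Bigr) \\
\Delta(\tau) &= (2\pi)^{12}\,q\prod_{n=1}^{\infty} (1 - q^n)^{24}
\end{aligned} \]

to show that the coefficients of the $q$-expansion of $j(\tau)$ are integral.
This is the classical method used to prove Theorem 11.8.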


13.13. Let $J(d_1,d_2)$ be as defined in the text.
(a) If $d_1, d_2 < -4$, then show that $J(d_1,d_2)$ is an integer. Hint: use
Galois theory.
(b) Show that $J(d_1,d_2)^2$ is always an integer. Hint: when $d_1$ or $d_2$
is $-3$, recall that $j((1 + \sqrt{-3})/2) = 0$. Theorem 12.2 will be
useful.


13.14. Let €(m) and F(n) be as defined in the text, and let p be a prime 
number. 


(a) Show that e(p) is defined whenever (d;d2/p) # —1. 


306 


13.15. 
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(b) If p divides a number of the form (d,d2 —x*)/4, then show 
that (did2/p) # —1. 
(c) Show that e(p) = —1 implies that (d,/p) #1 and (d2/p) # 1. 


Exercises 13.15 and 13.16 will prove Lemma 13.26. In this exercise 
we will show that any positive integer of the form m = (d,d2— 
x”)/4 satisfies «(m) = —1. We will need the following extension of 
the Legendre symbol. Let D = 0,1 mod 4, and let y:(Z/DZ)* — 
{+1} be the homomorphism from Lemma 1.14 (so that y([p]) = 
(D/p) when p is a prime not dividing D). Then for any integer m 
relatively prime to D, set 


(2) = x([)). 


(a) Show that $(D/m)$ is multiplicative in $D$ and $m$ and depends 
only on the congruence class of $m$ modulo $D$. Also, when $m = 
p_1^{a_1} \cdots p_r^{a_r}$ is positive, show that 


$$\left(\frac{D}{m}\right) = \prod_{i=1}^{r} \left(\frac{D}{p_i}\right)^{a_i},$$ 


where $(D/p_i)$ is the usual Kronecker symbol. Thus, when $m$ 
is odd and positive, $(D/m)$ is just the Jacobi symbol. Finally, 
show that $(D/-1) = \operatorname{sgn}(D)$. Hint: see Lemma 1.14. 


(b) We will need the following limited version of quadratic reciprocity 
for $(D/m)$. Namely, if $D \equiv 1 \bmod 4$ is relatively prime 
to $m \equiv 0,1 \bmod 4$, then prove that $(D/m) = (m/|D|)$. Furthermore, 
if $D$ and $m$ have opposite signs, then prove that $(D/m) 
= (m/D)$. 

(c) Let $m$ be a positive integer such that $\epsilon(m)$ is defined. If $m$ is 
relatively prime to $d_1$, then show that $\epsilon(m) = (d_1/m)$. 

(d) Now we can prove that $\epsilon(m) = -1$ when $m = (d_1d_2 - x^2)/4$. 
We can assume $d_1 \equiv 1 \bmod 4$, and write $m = ab$, where $a \mid 
d_1$, $a \equiv 1 \bmod 4$ and $\gcd(d_1,b) = 1$. Then $d_1 = ad$, where $d \equiv 
1 \bmod 4$. 

(i) Show that $\epsilon(m) = (d_2/a)(d_1/b)$. 

(ii) Show that $(d_1/b) = (a/d_2)(d/-1)$. Hint: $(d_1/b) = (d_1/4b) 
= (a/4b)(d/4b)$. Then use $4ab = d_1d_2 - x^2$ and quadratic 
reciprocity. Remember that $a$ and $d$ have opposite signs 
and that $a$ has no square factors. 

(iii) Use quadratic reciprocity to prove that $\epsilon(m) = -1$. Hint: 
remember that $d_2 < 0$. 


13.16. Let $m$ be a positive integer such that $\epsilon(m) = -1$. The goal of this 
exercise is to compute $F(m)$. We will use the function $s(m)$ defined 
by 

$$s(m) = \sum_{n \mid m,\ n > 0} \epsilon(n).$$ 

Note that $s(m)$ is defined whenever $\epsilon(m)$ is. Given a prime $p$, let 
$v_p(m)$ be the highest power of $p$ dividing $m$. 

(a) If $m_1$ and $m_2$ are relatively prime integers such that $\epsilon(m_1)$ and 
$\epsilon(m_2)$ are defined, then prove that 

$$F(m_1m_2) = F(m_1)^{s(m_2)}\, F(m_2)^{s(m_1)}.$$ 

(b) Suppose that $m = p_1^{a_1} \cdots p_r^{a_r} q_1^{b_1} \cdots q_s^{b_s}$, where $\epsilon(p_i) = -1$ and 
$\epsilon(q_i) = 1$ for all $i$. Prove that 

$$s(m) = \begin{cases} 0 & \text{some } a_i \text{ is odd} \\ \prod_{i=1}^{s} (b_i + 1) & \text{all } a_i\text{'s are even.} \end{cases}$$ 

(c) If $\epsilon(m) = -1$, show that there is at least one prime $p$ with 
$\epsilon(p) = -1$ and $v_p(m)$ odd. Conclude that $s(m) = 0$. 

(d) Suppose that $\epsilon(m) = -1$, and that $m$ is divisible by two primes 
$p$ and $q$ with $\epsilon(p) = \epsilon(q) = -1$ and $v_p(m)$ and $v_q(m)$ odd. 
Prove that $F(m) = 1$. Hint: write $m = p^{2a+1}q^{2b+1}m'$, and use 
(a)-(c). 

(e) Finally, suppose that $m$ is divisible by a unique prime $p$ with 
$\epsilon(p) = -1$ and $v_p(m)$ odd. Then $m$ can be written $m = p^{2a+1} p_1^{2a_1} 
\cdots p_r^{2a_r} q_1^{b_1} \cdots q_s^{b_s}$, where $\epsilon(p) = \epsilon(p_i) = -1$ and $\epsilon(q_i) = 1$ for all 
$i$. Prove that 

$$F(m) = p^{(a+1)(b_1+1)\cdots(b_s+1)}.$$ 

Hint: show that $F(p^{2a+1}) = p^{a+1}$, and use (a)-(c). 
By (d) and (e), we see that when $\epsilon(m) = -1$, $F(m)$ is computed 
by the formulas given in Lemma 13.26. Thus Lemma 13.26 is an 
immediate corollary of this exercise and the previous one. 


13.17. Let $p$ be a prime dividing $J(d_1,d_2)^2$. In Corollary 13.25, we showed 
that $p < d_1d_2/4$. In some cases, this estimate can be improved. 

(a) If $d_1d_2 \equiv 1 \bmod 8$, then prove that $p < d_1d_2/8$. Hint: use 
$p \mid (d_1d_2 - x^2)/4$. When $p = 2$, note that $d_1d_2 \equiv 1 \bmod 8$ 
implies $d_1d_2 \geq 33$. 

(b) If $d_1 \equiv d_2 \equiv 5 \bmod 8$, then prove that $p < d_1d_2/16$. Hint: when 
$p$ is odd, we have $p \mid (d_1d_2 - x^2)/8$. To rule out the case $2p = 
(d_1d_2 - x^2)/4$, use Exercise 13.15 and Lemma 13.26. When $p = 
2$, see (a). 


13.18. Use Theorem 13.24 to compute the constant terms of $H_{-56}(X)$ and 
$H_{-7}(X)$, and compare your results with (13.1) and (13.20). Hint: 
use Lemma 13.26 to compute $F(m)$. 


The theory of complex multiplication has enabled us to prove some wonderful 
results, but our treatment is still far from complete. In particular, 
we need to acquaint the reader with the more modern form of the theory, 
where elliptic functions are replaced by elliptic curves. Thus, in this last 
section of the book, we will give some of the basic definitions and theo- 
rems concerning elliptic curves, and we will discuss complex multiplication 
and elliptic curves over finite fields. Then, to illustrate the power of what 
we've done, we will examine two recent primality tests that involve ellip- 
tic curves, one of which makes use of the class equation. Our treatment of 
these topics will not be self-contained, for our purpose is mostly to entice 
the reader into learning more about this lovely subject. Excellent introductions 
to elliptic curves are available, notably the books by Husemöller [58], 
Koblitz [67] and Silverman [93], and more advanced topics are discussed in 
the books by Lang [73] and Shimura [90]. 


A. Elliptic Curves and Weierstrass Equations 


Given a field K of characteristic different from 2 or 3, an elliptic curve E 
over K is an equation of the form 


(14.1)    $y^2 = 4x^3 - g_2x - g_3,$ 


where 
$g_2, g_3 \in K$ and $\Delta = g_2^3 - 27g_3^2 \neq 0$. 


For reasons that will soon become clear, this equation is called the Weierstrass 
equation of E. When K has characteristic 2 or 3, a more complicated 
defining equation is needed (see Silverman [93, Appendix A]). 
Given an elliptic curve E over K, we define E(K) to be the set of solutions 
$$E(K) = \{(x,y) \in K \times K : y^2 = 4x^3 - g_2x - g_3\} \cup \{\infty\}.$$ 


The symbol $\infty$ appears because in algebraic geometry, it is best to work 
with homogeneous equations in projective space. Equation (14.1) defines a 
curve in the affine space $K^2$, but in the projective space $P^2(K)$ there is an 
extra “point at infinity” (see Exercise 14.1 for the details). Given a field 
extension $K \subset L$, we can also define $E(K) \subset E(L)$ in an obvious way. 
Over the complex numbers C, the Weierstrass $\wp$-function gives us elliptic 
curves as follows. Let $L \subset C$ be a lattice, and let $\wp(z) = \wp(z; L)$ be the 
corresponding $\wp$-function. Then we have the differential equation 


$$\wp'(z)^2 = 4\wp(z)^3 - g_2(L)\wp(z) - g_3(L)$$ 
of Theorem 10.1, which gives us the elliptic curve 
$$y^2 = 4x^3 - g_2(L)x - g_3(L).$$ 


If $z \notin L$, then $\wp(z)$ and $\wp'(z)$ are defined, and the differential equation 
shows that $(\wp(z), \wp'(z))$ is in E(C). Since $\wp(z)$ and $\wp'(z)$ are also periodic 
for L, we get a well-defined mapping 

$$(C - L)/L \to E(C) - \{\infty\}.$$ 


It is easy to show that this map is a bijection (see Exercise 14.2), and con- 
sequently we get a bijection 


(14.2)    $C/L \simeq E(C)$ 


by sending $0 \in C$ to $\infty \in E(C)$. Both C/L and E(C) have natural structures 
as Riemann surfaces, and it can be shown that the above map is biholomorphic. 

The unexpected fact is that every elliptic curve over C arises from a 
unique Weierstrass $\wp$-function. More precisely, we have the following result: 


Proposition 14.3. Let E be an elliptic curve over C given by the Weierstrass 
equation 
$$y^2 = 4x^3 - g_2x - g_3, \qquad g_2, g_3 \in C, \quad g_2^3 - 27g_3^2 \neq 0.$$ 
Then there is a unique lattice $L \subset C$ such that 
$$g_2 = g_2(L)$$ 
$$g_3 = g_3(L).$$ 


Proof. The existence of L was proved in Corollary 11.7, and the uniqueness 
follows from the proof of Theorem 10.9 (see Exercise 14.3). 
Q.E.D. 


Proposition 14.3 is often called the uniformization theorem for elliptic 
curves. Note that it is a consequence of the properties of the j-function. 
The mention of the j-function prompts our next definition: if an elliptic 
curve E over a field K is defined by the Weierstrass equation (14.1), then 
the j-invariant j(E) is defined to be the number 


$$j(E) = 1728\,\frac{g_2^3}{g_2^3 - 27g_3^2} = 1728\,\frac{g_2^3}{\Delta} \in K.$$ 

Note that j(E) is well-defined since $\Delta \neq 0$, and the factor of 1728 doesn’t 
cause trouble since K has characteristic different from 2 and 3 (the definition 
of the j-invariant is more complicated in characteristic 2 or 3; see Silverman 
[93, Appendix A]). Over the complex numbers, notice that 


$$j(L) = j(E)$$ 
whenever E is the elliptic curve determined by the lattice $L \subset C$. 
To define isomorphisms of elliptic curves, let E and E' be elliptic curves 
over K, defined by Weierstrass equations $y^2 = 4x^3 - g_2x - g_3$ and $y^2 = 
4x^3 - g_2'x - g_3'$ respectively. Then E and E' are isomorphic over K if there 
is a nonzero $c \in K$ such that 


$$g_2' = c^4g_2$$ 
$$g_3' = c^6g_3.$$ 
In this case, note that the map sending $(x,y)$ to $(c^2x, c^3y)$ induces a bijection 
$$E(K) \simeq E'(K).$$ 
It is trivial to check that isomorphic elliptic curves have the same j-invariant. 
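
The definitions of $\Delta$ and j(E), and the invariance of j under the rescaling $(g_2, g_3) \mapsto (c^4g_2, c^6g_3)$, can be checked with exact rational arithmetic. The following Python sketch is only an illustration: the function names and the sample coefficients (30, 28) are choices made here, not notation from the text.

```python
from fractions import Fraction

def discriminant(g2, g3):
    """Delta = g2^3 - 27*g3^2 for the curve y^2 = 4x^3 - g2*x - g3."""
    return g2**3 - 27 * g3**2

def j_invariant(g2, g3):
    """j(E) = 1728 * g2^3 / Delta, computed exactly over Q."""
    g2, g3 = Fraction(g2), Fraction(g3)
    delta = discriminant(g2, g3)
    if delta == 0:
        raise ValueError("Delta = 0, so this is not an elliptic curve")
    return 1728 * g2**3 / delta

def scale(g2, g3, c):
    """Coefficients (c^4*g2, c^6*g3) of the isomorphic curve E'."""
    c = Fraction(c)
    return c**4 * g2, c**6 * g3

print(j_invariant(30, 28))                           # 8000
print(j_invariant(*scale(30, 28, Fraction(3, 7))))   # 8000 again
```

Using `Fraction` rather than floating point keeps the comparison of j-invariants exact.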


Over the complex numbers, isomorphisms of elliptic curves are related 
to lattices and j-invariants as follows: 


Proposition 14.4. Let E and E' be elliptic curves corresponding to lattices 
L and L' respectively. Then the following statements are equivalent: 


(i) E and E' are isomorphic over C. 
(ii) L and L' are homothetic. 


(iii) $j(E) = j(E')$. 


Proof. This follows easily from Theorem 10.9. We leave the details to the 
reader (see Exercise 14.4). Q.E.D. 


What is more interesting is that part of this proposition generalizes to 
any algebraically closed field: 


Proposition 14.5. Let E and E' be elliptic curves over a field K. 
(i) E and E' have the same j-invariant if and only if they are isomorphic 
over a finite extension of K. 

(ii) If K is algebraically closed, then E and E' have the same j-invariant if 
and only if they are isomorphic over K. 


Proof. The proof is basically a transcription of the algebraic part of the 
proof of Theorem 10.9—see Exercise 14.4. Q.E.D. 


Over nonalgebraically closed fields, nonisomorphic elliptic curves may 
have the same j-invariant (see Exercise 14.4 for an example over Q). Later, 
we will discuss the isomorphism classes of elliptic curves over a finite field. 

Finally, we need to discuss the group structure on an elliptic curve. The 
basic idea is to translate the addition law for the Weierstrass $\wp$-function into 
algebraic terms. To see how this works, let E be an elliptic curve over K, 
and let $P_1$ and $P_2$ be two points in E(K). Our goal is to define $P_1 + P_2 \in 
E(K)$. If $P_1 = \infty$, we define 


$$P_1 + P_2 = \infty + P_2 = P_2,$$ 


and the case $P_2 = \infty$ is treated similarly. Thus $\infty$ will be the identity 
element of E(K). For the remaining cases, we may write $P_1 = (x_1,y_1)$ and 
$P_2 = (x_2,y_2)$. If $x_1 \neq x_2$, then we define 


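$$P_1 + P_2 = (x_3, y_3),$$ 


where $x_3$ and $y_3$ are given by 


(14.6) 
$$x_3 = -x_1 - x_2 + \frac{1}{4}\left(\frac{y_1 - y_2}{x_1 - x_2}\right)^2$$ 
$$y_3 = -y_1 - (x_3 - x_1)\left(\frac{y_1 - y_2}{x_1 - x_2}\right).$$ 


These formulas come from the addition laws for $\wp(z + w)$ and $\wp'(z + w)$ 
(see Theorem 10.1 and Exercise 14.5). 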

We still need to consider what happens when $x_1 = x_2$. In this case, the 
Weierstrass equation implies that $y_1 = \pm y_2$, so that there are two cases to 
consider. When $y_1 = -y_2$, we define 


$$P_1 + P_2 = \infty.$$ 


This formula tells us that the inverse of $(x,y) \in E(K)$ is $(x,-y)$. Finally, 
suppose that $P_1 = P_2$, where $y_1 = y_2 \neq 0$. Here, we define 


$$P_1 + P_2 = 2P_1 = (x_3, y_3),$$ 
where $x_3$ and $y_3$ are given by 


(14.7) 
$$x_3 = -2x_1 + \frac{1}{16}\left(\frac{12x_1^2 - g_2}{y_1}\right)^2$$ 
$$y_3 = -y_1 - (x_3 - x_1)\left(\frac{12x_1^2 - g_2}{2y_1}\right).$$ 


These formulas come from the duplication laws for $\wp(2z)$ and $\wp'(2z)$ (see 
(10.13) and Exercise 14.5). The major fact is that we get a group: 


Theorem 14.8. If E is an elliptic curve over a field K, then E(K) is a group 
(with $\infty$ as identity) under the binary operation defined above. 


Proof. See Husemöller [58], Koblitz [67] or Silverman [93] for a proof. 
These references also explain a lovely geometric interpretation of the above 
formulas. Q.E.D. 
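
The case analysis leading to (14.6) and (14.7) translates directly into code when K is a prime field $F_p$ with p > 3. The sketch below is a minimal illustration, not an optimized implementation: the names `INF`, `add` and `points` are chosen here, points are assumed to have coordinates already reduced to the range 0..p-1, and the sample curve $y^2 = 4x^3 - 30x - 28$ over $F_{11}$ is an arbitrary choice.

```python
INF = None  # stands for the point at infinity

def on_curve(P, g2, g3, p):
    """True if P lies on y^2 = 4x^3 - g2*x - g3 over F_p."""
    if P is INF:
        return True
    x, y = P
    return (y * y - (4 * x**3 - g2 * x - g3)) % p == 0

def add(P1, P2, g2, g3, p):
    """P1 + P2 in E(F_p); coordinates are assumed reduced to 0..p-1."""
    if P1 is INF:
        return P2
    if P2 is INF:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 != x2:
        # chord through P1 and P2, as in (14.6)
        slope = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    elif (y1 + y2) % p == 0:
        # y1 = -y2: the two points are inverse to each other
        return INF
    else:
        # P1 = P2 with y1 != 0: tangent line, as in (14.7)
        slope = (12 * x1 * x1 - g2) * pow(2 * y1 % p, -1, p) % p
    x3 = (-x1 - x2 + slope * slope * pow(4, -1, p)) % p
    y3 = (-y1 - (x3 - x1) * slope) % p
    return (x3, y3)

def points(g2, g3, p):
    """All points of E(F_p), found by brute force."""
    affine = [(x, y) for x in range(p) for y in range(p)
              if on_curve((x, y), g2, g3, p)]
    return [INF] + affine

E = points(30, 28, 11)
print(len(E))   # 18
```

In (14.7) the term $\frac{1}{16}((12x_1^2 - g_2)/y_1)^2$ equals one quarter of the squared tangent slope $(12x_1^2 - g_2)/(2y_1)$, which is why a single formula for $x_3$ serves both cases in the code. The three-argument `pow` with exponent -1 requires Python 3.8 or later.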


If E is an elliptic curve over K and K C L is a field extension, then it is 
easy to show that E(K) is a subgroup of E(L). 

Over the complex numbers, we saw in (14.2) that there is a bijection 
$C/L \simeq E(C)$. Notice that both of these objects are groups: C/L has a natural 
group structure induced by addition of complex numbers, and E(C) has 
the group structure defined in Theorem 14.8. It is immediate that the map 
$C/L \simeq E(C)$ is a group isomorphism. 


B. Complex Multiplication and Elliptic Curves 


The next topic to discuss is the complex multiplication of elliptic curves. 
The idea is to take the theory developed in §§10 and 11 and translate lattices 
into elliptic curves. The crucial step is to get an algebraic description 
of complex multiplication, which can then be used over arbitrary fields. 

Let’s start by describing the endomorphism ring of an elliptic curve E 
over C. Namely, if E corresponds to the lattice L, we define 


$$\mathrm{End}_C(E) = \{\alpha \in C : \alpha L \subset L\}.$$ 


This is clearly a subring of C, and note that $Z \subset \mathrm{End}_C(E)$. Then we say 
that E has complex multiplication if $Z \neq \mathrm{End}_C(E)$. From Theorem 10.14, it 
follows that E has complex multiplication if and only if L does, and in this 
case, $\mathrm{End}_C(E)$ is an order O in an imaginary quadratic field. 

Given $\alpha \in O$, the inclusion $\alpha L \subset L$ gives us a group homomorphism 
$\alpha : C/L \to C/L$. Combined with (14.2), we see that $\alpha \in \mathrm{End}_C(E)$ 
induces a group homomorphism 


$$\alpha : E(C) \to E(C).$$ 
In terms of the x and y coordinates of a point in E(C), this map can be 
described as follows: 


Proposition 14.9. Given $\alpha \neq 0 \in \mathrm{End}_C(E)$, there is a rational function $R(x) 
\in C(x)$ such that for $(x,y) \in E(C)$, we have 


$$\alpha(x,y) = \left(R(x),\ \frac{1}{\alpha}R'(x)\,y\right),$$ 
where $R'(x) = (d/dx)R(x)$. 


Proof. Given $\alpha L \subset L$, we saw in Theorem 10.14 that there is a rational 
function $R(x)$ such that $\wp(\alpha z) = R(\wp(z))$. Differentiating with respect to z 
gives $\wp'(\alpha z)\alpha = R'(\wp(z))\wp'(z)$, and thus $\wp'(\alpha z) = (1/\alpha)R'(\wp(z))\wp'(z)$. 
Since $\alpha : E(C) \to E(C)$ comes from $\alpha : C/L \to C/L$ via the map $z \mapsto 
(\wp(z), \wp'(z))$, the proposition follows. Q.E.D. 


Because of the algebraic nature of $\alpha \in \mathrm{End}_C(E)$, we write $\alpha : E \to E$ 
instead of $\alpha : E(C) \to E(C)$. When $\alpha \neq 0$, we say that $\alpha$ is an isogeny from 
E to itself. The most important invariant of an isogeny is its degree $\deg(\alpha)$, 
which is defined to be the order of its kernel. More precisely, if E corresponds 
to the lattice L, then it is easy to see that the kernel of $\alpha : E(C) \to 
E(C)$ is isomorphic to $L/\alpha L$ (see Exercise 14.6). Thus, by Theorem 10.14, 
it follows that 

$$\deg(\alpha) = |L/\alpha L| = N(\alpha),$$ 


where $N(\alpha)$ is the norm of $\alpha \in O = \mathrm{End}_C(E)$. 
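
The equality $|L/\alpha L| = N(\alpha)$ can be made concrete for the lattice $L = [1, \sqrt{-2}]$. The sketch below is illustrative only: it relies on the standard fact (not proved in the text) that the index of a sublattice equals the absolute value of the determinant of the integer matrix expressing its basis, and the function names are chosen here.

```python
def mult_matrix(a, b):
    """Matrix of multiplication by alpha = a + b*sqrt(-2) on L = [1, sqrt(-2)].

    Columns hold the coordinates of the images of the basis vectors:
        alpha * 1        = a    + b*sqrt(-2)
        alpha * sqrt(-2) = -2*b + a*sqrt(-2)
    """
    return [[a, -2 * b],
            [b, a]]

def index_of_image(a, b):
    """|L / alpha L| as the absolute value of the determinant."""
    (m11, m12), (m21, m22) = mult_matrix(a, b)
    return abs(m11 * m22 - m12 * m21)

def norm(a, b):
    """N(a + b*sqrt(-2)) = a^2 + 2*b^2."""
    return a * a + 2 * b * b

print(index_of_image(0, 1), norm(0, 1))   # 2 2, so sqrt(-2) has degree 2
```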
For an example of complex multiplication, consider the elliptic curve E 
defined by 
$$y^2 = 4x^3 - 30x - 28.$$ 


We claim that $\mathrm{End}_C(E) = Z[\sqrt{-2}]$, and that for $(x,y) \in E(C)$, complex 
multiplication by $\sqrt{-2}$ is an isogeny of degree 2 given by the formula 


(14.10) 
$$\sqrt{-2}\,(x,y) = \left(-\frac{2x^2 + 4x + 9}{4(x+2)},\ -\frac{1}{\sqrt{-2}}\,\frac{2x^2 + 8x - 1}{4(x+2)^2}\,y\right).$$ 


It turns out that the major work of this claim was proved in §10 when we 
considered the lattice $L = [1, \sqrt{-2}]$. Namely, in the discussion surrounding 
(10.21) and (10.22), we showed that for some $\lambda$, 
$$g_2(\lambda L) = \frac{5 \cdot 27}{2}$$ 
$$g_3(\lambda L) = \frac{7 \cdot 27}{2}.$$ 
If we set $\lambda' = \sqrt{3/2}\,\lambda$, then it follows that 
$$g_2(\lambda' L) = 30$$ 
$$g_3(\lambda' L) = 28,$$ 


which implies that E has complex multiplication by $\sqrt{-2}$. Furthermore, 
the formula for $\wp(\sqrt{-2}z)$ given in (10.21) and (10.22) easily combines with 
Proposition 14.9 to prove (14.10) (see Exercise 14.7). 
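
Formula (14.10) can be sanity-checked by machine. The sketch below does this modulo the prime 11, where $-2 \equiv 3^2$, so that $\sqrt{-2}$ can be replaced by the residue 3. Working modulo a prime, the choice p = 11, and all function names are assumptions made for this illustration; the check only confirms that the image of each point again satisfies the Weierstrass equation and that $(-2, 0)$ is sent to $\infty$, not the full claim about $\mathrm{End}_C(E)$.

```python
P_MOD = 11       # a prime > 3 at which -2 is a square
SQRT_M2 = 3      # 3^2 = 9 = -2 mod 11
G2, G3 = 30, 28  # the curve y^2 = 4x^3 - 30x - 28

def inv(a, p=P_MOD):
    """Inverse of a modulo p."""
    return pow(a % p, -1, p)

def on_curve(x, y, p=P_MOD):
    return (y * y - (4 * x**3 - G2 * x - G3)) % p == 0

def cm_sqrt_minus2(x, y, p=P_MOD):
    """Image of (x, y) under (14.10); None stands for the point at infinity."""
    if (x + 2) % p == 0:
        return None  # (-2, 0) has order 2 and lies in the kernel
    x_new = -(2 * x * x + 4 * x + 9) * inv(4 * (x + 2)) % p
    y_new = -inv(SQRT_M2) * (2 * x * x + 8 * x - 1) * inv(4 * (x + 2) ** 2) * y % p
    return x_new, y_new

affine = [(x, y) for x in range(P_MOD) for y in range(P_MOD) if on_curve(x, y)]
print(all(cm_sqrt_minus2(x, y) is None or on_curve(*cm_sqrt_minus2(x, y))
          for x, y in affine))   # True
```

The kernel $\{\infty, (-2, 0)\}$ has two elements, in agreement with the degree $N(\sqrt{-2}) = 2$.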

For an elliptic curve E over an arbitrary field K, we can’t use lattices to 
define complex multiplication. But as indicated by Proposition 14.9, there 
is a purely algebraic definition of the endomorphism ring $\mathrm{End}_K(E)$ that 
depends only on the defining equation of E (see Silverman [93, Chapter 
III]). Because of the group structure of E, $\mathrm{End}_K(E)$ always contains Z, 
and if K has characteristic zero, we say that E has complex multiplication 
if $\mathrm{End}_{\overline{K}}(E) \neq Z$, where $\overline{K}$ is the algebraic closure of K (thus the complex 
multiplications may only be defined over finite extensions of K). When K 
is a finite field, we will see below that $\mathrm{End}_{\overline{K}}(E)$ is always bigger than Z. For 
this reason, the term “complex multiplication” is rarely used when K has 
positive characteristic. 

When $K \subset C$, we can describe the endomorphism ring $\mathrm{End}_K(E)$ as follows. 
Namely, let $\alpha \in \mathrm{End}_C(E)$, and use Proposition 14.9 to write $\alpha(x,y) = 
(R(x), (1/\alpha)R'(x)y)$ for $(x,y) \in E(C)$. Then 


$$\alpha \in \mathrm{End}_K(E) \iff R(x),\ \frac{1}{\alpha}R'(x) \in K(x).$$ 


Another interesting case is when $K = F_q$ is a finite field. Here, the map 
sending $(x,y)$ to $(x^q, y^q)$ clearly defines a group homomorphism $E(L) \to 
E(L)$ for any field L containing K (see Exercise 14.8). This gives an element 
$\mathrm{Frob}_q \in \mathrm{End}_K(E)$, which is called the Frobenius endomorphism of E. 
It will play an important role later on. Notice that this map is not of the 
form $(R(x), (1/\alpha)R'(x)y)$. 

In this abstract setting, one can still define the degree of an isogeny $\alpha \neq 
0 \in \mathrm{End}_K(E)$. When $K \subset C$, the degree of $\alpha$ is again the order of the kernel of 
$\alpha : E(C) \to E(C)$, while over a finite field, the degree is more subtle to define. 
For example, the Frobenius isogeny $\mathrm{Frob}_q$ always has degree q even though 
$\mathrm{Frob}_q : E(L) \to E(L)$ is injective for any field $K \subset L$. See Silverman [93, 
§III.4] for a precise definition of the degree of an isogeny. 

Besides isogenies from E to itself (which are recorded by $\mathrm{End}_K(E)$), one 
can also define the notion of an isogeny $\alpha$ between different elliptic curves 
E and E' over the same field K. For simplicity, we will confine our remarks 
to the case K = C. In this situation, E and E' correspond to lattices L and 
L'. If $\alpha \neq 0$ is a complex number such that $\alpha L \subset L'$, then multiplication by 
$\alpha$ induces a map 

$$\alpha : E(C) \to E'(C)$$ 
with kernel $L'/\alpha L$, and we say that $\alpha$ is an isogeny from E to E'. As in 
Proposition 14.9, one can show that $\alpha$ is essentially algebraic in nature (see 
Exercise 14.9), so that we can write $\alpha$ as $\alpha : E \to E'$. 

The notion of isogeny has a close relation to the modular equation. We 
define an isogeny $\alpha : E \to E'$ to be cyclic if its kernel $L'/\alpha L$ is cyclic. Then 
we have: 


Proposition 14.11. Let E and E' be elliptic curves over C. Then there is 
a cyclic isogeny $\alpha$ from E to E' of degree m if and only if $\Phi_m(j(E), j(E')) 
= 0$. 


Proof. This follows easily from the analysis of $\Phi_m(u,v) = 0$ given in Theorem 
11.23 (see Exercise 14.10). Q.E.D. 


For a more complete treatment of these topics, see Lang [73, Chapters 
2 and 5] and Silverman [93, Chapter III]. 


C. Elliptic Curves over Finite Fields 


So far, we’ve translated concepts about lattices into concepts about elliptic 
curves. If this were all that happened, there would be no special reason to 
study elliptic curves. The important point is that the algebraic formulation 
allows us to state some fundamentally new results, the most interesting of 
which involve elliptic curves over a finite field $F_q$. As usual, we will assume 
that $F_q$ has characteristic greater than 3, i.e., $q = p^a$, $p > 3$. 

When E is an elliptic curve over $F_q$, the group of solutions $E(F_q)$ is a 
finite Abelian group, and it is easy to see that its order $|E(F_q)|$ is at most 
$2q + 1$ (see Exercise 14.11). In 1934, Hasse proved the following stronger 
bound conjectured by Artin: 


Theorem 14.12. If E is an elliptic curve over $F_q$, then 
$$q + 1 - 2\sqrt{q} \leq |E(F_q)| \leq q + 1 + 2\sqrt{q}.$$ 


Proof. We will discuss some of the ideas used in the proof. The key ingredient 
is the isogeny $\mathrm{Frob}_q \in \mathrm{End}_{F_q}(E)$ defined by $\mathrm{Frob}_q(x,y) = (x^q, y^q)$. 
We can form the isogeny $1 - \mathrm{Frob}_q$, and it follows easily that if $\overline{F}_q$ is the 
algebraic closure of $F_q$, then 
$$E(F_q) = \ker(1 - \mathrm{Frob}_q : E(\overline{F}_q) \to E(\overline{F}_q))$$ 


(see Exercise 14.12). The next step is to show that $1 - \mathrm{Frob}_q$ is a separable 
isogeny, which implies that 


(14.13)    $|E(F_q)| = \deg(1 - \mathrm{Frob}_q).$ 
From here, the proof is a straightforward consequence of the basic properties 
of isogenies (see Silverman [93, Chapter V, Theorem 1.1]). Q.E.D. 
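
Hasse's bound is easy to test by brute force over small prime fields. The following sketch is only an illustration (the function names are chosen here, and it treats q = p prime rather than a general prime power). It counts the points of every curve $y^2 = 4x^3 - g_2x - g_3$ over $F_p$ and checks the inequality in the equivalent integer form $(|E(F_p)| - p - 1)^2 \leq 4p$.

```python
def count_points(g2, g3, p):
    """|E(F_p)| for E: y^2 = 4x^3 - g2*x - g3, with p > 3 prime."""
    # roots[v] = number of y in F_p with y^2 = v
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    affine = sum(roots[(4 * x**3 - g2 * x - g3) % p] for x in range(p))
    return affine + 1  # add the point at infinity

def hasse_holds(n, q):
    """q + 1 - 2*sqrt(q) <= n <= q + 1 + 2*sqrt(q), tested without floats."""
    return (n - q - 1) ** 2 <= 4 * q

def all_curves(p):
    """All (g2, g3) in F_p^2 with Delta = g2^3 - 27*g3^2 nonzero."""
    return [(g2, g3) for g2 in range(p) for g3 in range(p)
            if (g2**3 - 27 * g3**2) % p != 0]

for p in (5, 7, 11, 13):
    orders = {count_points(g2, g3, p) for g2, g3 in all_curves(p)}
    print(p, min(orders), max(orders),
          all(hasse_holds(n, p) for n in orders))
```

The trivial bound $2q + 1$ mentioned before the theorem corresponds to at most two values of y for each x, plus the point at infinity.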


In 1946, Weil proved a similar result for algebraic curves over finite 
fields, and in 1974, Deligne proved a vast generalization (conjectured by 
Weil) to higher dimensional algebraic varieties. For further discussion and 
references, see Ireland and Rosen [59, Chapter 11] and Silverman [93, §V.2]. 

Elliptic curves over finite fields come in two types, ordinary and super- 
singular, as determined by their endomorphism rings: 


Theorem 14.14. If E is an elliptic curve over $F_q$, then the endomorphism 
ring $\mathrm{End}_{\overline{F}_q}(E)$ is either an order in an imaginary quadratic field or an order 
in a quaternion algebra. 

Remarks. 

(i) We say that E is ordinary in the former case and supersingular in the 
latter. 

(ii) Notice that for elliptic curves over a finite field K, $\mathrm{End}_{\overline{K}}(E)$ is always 
larger than Z. 


Proof. See Silverman [93, Chapter V, Theorem 3.1]. Q.E.D. 


There are many known criteria for E to be supersingular (see Husemöller 
[58, p. 258] for an exhaustive list). Over a prime field $F_p$, there is a 
special criterion which will be useful later on: 


Proposition 14.15. Let E be an elliptic curve over $F_p$. If $p > 3$, then E is 
supersingular if and only if 


$$|E(F_p)| = p + 1.$$ 
Proof. See Silverman [93, Chapter V, Exercise 5.10]. Q.E.D. 


It is interesting to note that $|E(F_p)| = p + 1$ is the center of the range 
$p + 1 - 2\sqrt{p} \leq |E(F_p)| \leq p + 1 + 2\sqrt{p}$ allowed by Hasse’s theorem. 
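
A simple family illustrates the criterion of Proposition 14.15. When $p \equiv 2 \bmod 3$, cubing is a bijection of $F_p$, so $4x^3 - c$ takes every value exactly once and the curve $y^2 = 4x^3 - c$ has exactly p affine points, hence $p + 1$ points in all. This argument is a standard observation supplied here for illustration and is not taken from the text; the function names are likewise chosen here.

```python
def count_points(g2, g3, p):
    """|E(F_p)| for E: y^2 = 4x^3 - g2*x - g3, with p > 3 prime."""
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(4 * x**3 - g2 * x - g3) % p] for x in range(p))

def is_supersingular(g2, g3, p):
    """Criterion of Proposition 14.15, valid for p > 3."""
    return count_points(g2, g3, p) == p + 1

# p = 11 is 2 mod 3: every curve y^2 = 4x^3 - c with c != 0 is supersingular.
print(all(is_supersingular(0, c, 11) for c in range(1, 11)))   # True
# p = 7 is 1 mod 3: none of the curves y^2 = 4x^3 - c is supersingular.
print(any(is_supersingular(0, c, 7) for c in range(1, 7)))     # False
```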

From the point of view of endomorphisms, ordinary elliptic curves over 
finite fields behave like elliptic curves over C with complex multiplication, 
since in each case, the endomorphism ring is an order in an imaginary 
quadratic field. This suggests a deeper relation between these two classes, 
which leads to our next topic, reduction of elliptic curves. 

The basic idea of reduction is the following. Let K be a number field, 
and let E be an elliptic curve defined by 


$$y^2 = 4x^3 - g_2x - g_3, \qquad g_2, g_3 \in K.$$ 
If $\mathfrak{p}$ is prime in $O_K$, we want to “reduce” E modulo $\mathfrak{p}$. This can’t be done 
in general, but suppose that $g_2$ and $g_3$ can be written in the form $\alpha/\beta$, 
where $\alpha, \beta \in O_K$ and $\beta \notin \mathfrak{p}$. Then we can define $[g_2]$ and $[g_3]$ in $O_K/\mathfrak{p}$. If, 
in addition, we have 


$$\overline{\Delta} = [g_2]^3 - 27[g_3]^2 \neq 0 \in O_K/\mathfrak{p},$$ 


then 
$$y^2 = 4x^3 - [g_2]x - [g_3]$$ 


is an elliptic curve $\overline{E}$ over the finite field $O_K/\mathfrak{p}$. In this case we call $\overline{E}$ the 
reduction of E modulo $\mathfrak{p}$, and we say that E has good reduction modulo $\mathfrak{p}$. 

When E has complex multiplication and good reduction, Deuring, drawing 
on examples of Gauss, discovered an astonishing relation between the 
complex multiplication of E and the number of points in $\overline{E}(O_K/\mathfrak{p})$. Rather 
than state his result in its full generality, we will present a version that concerns 
only elliptic curves over the prime field $F_p$. 

To set up the situation, let O be an order in an imaginary quadratic field 
K, and let L be the ring class field of O. Let p be a prime in Z which 
splits completely in L, and fix a prime $\mathfrak{P}$ of L lying above p, so 
that $O_L/\mathfrak{P} \simeq F_p$. Finally, let E be an elliptic curve over L which has good 
reduction at $\mathfrak{P}$. With these hypotheses, the reduction $\overline{E}$ is an elliptic curve 
over $F_p$. Then we have the following theorem: 


Theorem 14.16. Let O, L, p and $\mathfrak{P}$ be as above, and let E be an elliptic 
curve over L with $\mathrm{End}_C(E) = O$. If E has good reduction modulo $\mathfrak{P}$, then 
there is $\pi \in O$ such that $p = \pi\overline{\pi}$ and 


$$|\overline{E}(F_p)| = p + 1 - (\pi + \overline{\pi}).$$ 


Furthermore, $\mathrm{End}_{\overline{F}_p}(\overline{E}) = O$, and every elliptic curve over $F_p$ with endomorphism 
ring (over $\overline{F}_p$) equal to O arises in this way. 


Proof. The basic idea is that when the above hypotheses are fulfilled, reduction 
induces an isomorphism 


$$\mathrm{End}_C(E) \xrightarrow{\ \sim\ } \mathrm{End}_{\overline{F}_p}(\overline{E})$$ 


that preserves degrees. The proof of this fact is well beyond the scope of 
this book (see Lang [73, Chapter 13, Theorem 12]). 

From the above isomorphism, it follows that there is some $\pi \in \mathrm{End}_C(E)$ 
which reduces to $\mathrm{Frob}_p \in \mathrm{End}_{\overline{F}_p}(\overline{E})$. Since $\mathrm{Frob}_p$ has degree p, so does 
$\pi$. Over the complex numbers, we know that the degree of $\pi \in O = 
\mathrm{End}_C(E)$ is just its norm, so that $N(\pi) = p$. Thus we can write $p = \pi\overline{\pi}$ 
in O. 
It is now trivial to compute the number of points on $\overline{E}$. As we noted in 
(14.13), 
$$|\overline{E}(F_p)| = \deg(1 - \mathrm{Frob}_p).$$ 


Since the reduction map preserves degrees, it follows that 
$$\deg(1 - \mathrm{Frob}_p) = \deg(1 - \pi) = N(1 - \pi) = (1 - \pi)(1 - \overline{\pi}) = p + 1 - (\pi + \overline{\pi})$$ 


since $p = \pi\overline{\pi}$. This proves the desired formula for $|\overline{E}(F_p)|$. 
For a proof of the final part of the theorem, see Lang [73, Chapter 13, 
Theorems 13 and 14]. Q.E.D. 
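
Theorem 14.16 can be tried out on the curve $y^2 = 4x^3 - 30x - 28$, which has complex multiplication by $O = Z[\sqrt{-2}]$. The sketch below is an illustration under stated assumptions: it uses the standard fact that $Z[\sqrt{-2}]$ has class number one, so that its ring class field is $K = Q(\sqrt{-2})$ and the primes that split completely are those of the form $p = a^2 + 2b^2$. Then $\pi = \pm a \pm b\sqrt{-2}$, so the theorem predicts $|\overline{E}(F_p)| = p + 1 \pm 2a$. The function names are chosen here.

```python
from math import isqrt

def count_points(g2, g3, p):
    """|E(F_p)| for E: y^2 = 4x^3 - g2*x - g3, with p > 3 prime."""
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(4 * x**3 - g2 * x - g3) % p] for x in range(p))

def norm_form_rep(p):
    """Positive (a, b) with p = a^2 + 2*b^2, or None if there is none."""
    for a in range(1, isqrt(p) + 1):
        rest = p - a * a
        if rest > 0 and rest % 2 == 0:
            b = isqrt(rest // 2)
            if b * b == rest // 2:
                return a, b
    return None

def predicted_orders(p):
    """The two values p + 1 -/+ 2a allowed by Theorem 14.16, or None."""
    rep = norm_form_rep(p)
    if rep is None:
        return None
    a, _ = rep
    return {p + 1 - 2 * a, p + 1 + 2 * a}

for p in (11, 17, 19, 41, 43):
    print(p, count_points(30, 28, p), sorted(predicted_orders(p)))
```

For p = 11 = 3^2 + 2·1^2 the curve has 18 = 12 + 6 points, so here $\pi + \overline{\pi} = -6$. Which of the two signs occurs is not decided by this sketch.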


The remarkable fact is that we’ve already seen two examples of this theorem. 
First, in (4.24), we stated the following result of Gauss: if $p \equiv 1 \bmod 3$ 
is prime, then 


(14.17)    If $4p = a^2 + 27b^2$ and $a \equiv 1 \bmod 3$, then $N = p + a - 2$, where 
N is the number of solutions modulo p of $x^3 - y^3 \equiv 1 \bmod p$. 


We can relate this to Deuring’s theorem as follows. The coordinate change 
$(x,y) \mapsto (3x/(1+y), 9(1-y)/(1+y))$ transforms the curve $x^3 = y^3 + 1$ into 
the elliptic curve E defined by $y^2 = 4x^3 - 27$ (see Exercise 14.13). Gauss 
didn’t count the three points at infinity that lie on $x^3 = y^3 + 1$, and when 
these are taken into account, then (14.17) asserts that $|\overline{E}(F_p)| = p + 1 + a$. 
Since $p \equiv 1 \bmod 3$, we can write $p = \pi\overline{\pi}$ in $Z[\omega]$, $\omega = e^{2\pi i/3}$. In §4, we saw 
that $\pi$ may be chosen to be primary, which means $\pi \equiv \pm 1 \bmod 3$. Thus 
we may assume $\pi \equiv 1 \bmod 3$, so that $\pi = A + 3B\omega$, $A \equiv 1 \bmod 3$. Then an 
easy calculation shows that 


$$4p = (-(2A - 3B))^2 + 27B^2.$$ 


Since $2A - 3B = \pi + \overline{\pi}$ and $-(2A - 3B) \equiv 1 \bmod 3$, it follows that (14.17) 
may be stated as follows: 


If $p = \pi\overline{\pi}$ in $Z[\omega]$ and $\pi \equiv 1 \bmod 3$, then $|\overline{E}(F_p)| = p + 1 - (\pi + \overline{\pi})$. 


Since $\overline{E}$ is the reduction of $y^2 = 4x^3 - 27$, which has complex multiplication 
by $Z[\omega]$ (see Exercise 14.13), Gauss’s observation (14.17) really is a special 
case of Deuring’s theorem. 
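
Gauss's count (14.17) is easy to confirm numerically for small primes. The sketch below is a brute-force illustration with function names chosen here: it finds the integer a with $4p = a^2 + 27b^2$ and $a \equiv 1 \bmod 3$, counts the solutions of $x^3 - y^3 \equiv 1 \bmod p$ directly, and compares the two sides.

```python
from math import isqrt

def gauss_a(p):
    """The a with 4p = a^2 + 27*b^2 and a = 1 mod 3 (p = 1 mod 3 prime)."""
    bound = isqrt(4 * p)
    for a in range(-bound, bound + 1):
        if a % 3 != 1:          # Python's % is nonnegative, so -5 % 3 == 1
            continue
        rest = 4 * p - a * a
        if rest > 0 and rest % 27 == 0:
            b = isqrt(rest // 27)
            if b * b == rest // 27:
                return a
    raise ValueError("no representation 4p = a^2 + 27b^2 found")

def count_solutions(p):
    """Number of (x, y) in F_p^2 with x^3 - y^3 = 1."""
    return sum(1 for x in range(p) for y in range(p)
               if (x**3 - y**3 - 1) % p == 0)

for p in (7, 13, 19, 31, 37):
    a = gauss_a(p)
    print(p, a, count_solutions(p), p + a - 2)
```

For p = 7 one has 28 = 1 + 27, so a = 1 and N = 6; for p = 13 one has 52 = 25 + 27, so a = -5 and again N = 6.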

Similarly, one can check that Gauss’s last diary entry, which concerned 
the number of solutions of $x^2 + y^2 + x^2y^2 \equiv 1 \bmod p$, is also a special case 
of Deuring’s theorem. See the discussion following (4.24) and Exercise 
14.14. 

As an application of Deuring’s theorem, we can give a formula for the 
number of elliptic curves over $F_p$ which have a preassigned number of 
points. We first need some notation. Given an order O in an imaginary 
quadratic field K, we define the Hurwitz class number H(O) to be the 
weighted sum of class numbers 


$$H(O) = \sum_{O \subset O' \subset O_K} \frac{2}{|O'^*|}\, h(O').$$ 
We also write H(O) as H(D), where D is the discriminant of O. Then we 
have the following theorem of Deuring: 


Theorem 14.18. Let $p > 3$ be prime, and let $N = p + 1 - a$ be an integer, 
where $-2\sqrt{p} < a < 2\sqrt{p}$. Then the number of elliptic curves E over $F_p$ 
which have $|E(F_p)| = N = p + 1 - a$ is 


$$\frac{p-1}{2}\, H(a^2 - 4p).$$ 


Proof. Let $\pi$ be a root of $x^2 - ax + p$. Since $-2\sqrt{p} < a < 2\sqrt{p}$, the quadratic 
formula shows that $O_a = Z[\pi]$ is an order in an imaginary quadratic 
field K. One can also check that p doesn’t divide the conductor of $O_a$ (in 
fact, when $a \neq 0$ it doesn’t divide the discriminant), and hence the same is true for any 
order O' containing $O_a$ (see Exercise 14.15). 

We will start with the case $a \neq 0$, which by Proposition 14.15 means that 
all of the elliptic curves involved are ordinary. Given an order O' containing 
$O_a$ and a proper O'-ideal $\mathfrak{a}$, we will produce a collection of elliptic curves 
$E_c$ with good reduction modulo p. Namely, let L' be the ring class field 
of O'. Since $p = \pi\overline{\pi}$ in $O_a \subset O'$, it follows from Theorem 9.4 that p splits 
completely in L'. Thus, if $\mathfrak{P}$ is any prime of L' containing p, then $O_{L'}/\mathfrak{P} 
\simeq F_p$. 

First, assume that $O' \neq Z[i]$ or $Z[\omega]$, $\omega = e^{2\pi i/3}$, which implies that 
$j(\mathfrak{a}) \neq 0, 1728$. If we let 

$$k = \frac{27\,j(\mathfrak{a})}{j(\mathfrak{a}) - 1728},$$ 


then we define the collection of elliptic curves $E_c$ over L' by the Weierstrass 
equations 
$$y^2 = 4x^3 - kc^2x - kc^3,$$ 


where $c \in O_{L'} - \mathfrak{P}$ is arbitrary. A computation shows that $j(E_c) = j(\mathfrak{a})$. We 
can reduce k modulo $\mathfrak{P}$ provided that $j(\mathfrak{a}) - 1728 \notin \mathfrak{P}$. Since $1728 = j(i)$, 
Theorem 13.21 implies that 

$$j(\mathfrak{a}) \equiv 1728 \bmod \mathfrak{P} \implies p \text{ does not split in } K \text{ or } Q(i)$$ 


(when $K = Q(i)$, note that the conductor condition of Theorem 13.21 is 
satisfied). However, p splits in K, and thus $j(\mathfrak{a}) - 1728 \notin \mathfrak{P}$, as desired. 
Then one computes that in $O_{L'}/\mathfrak{P} \simeq F_p$, 


$$\overline{\Delta} = [kc^2]^3 - 27[kc^3]^2 = 1728\,[c^6]\left[\frac{27^3\, j(\mathfrak{a})^2}{(j(\mathfrak{a}) - 1728)^3}\right].$$ 

By the argument used to prove $j(\mathfrak{a}) - 1728 \notin \mathfrak{P}$, Theorem 13.21 and $j(\omega) = 
0$ show that $j(\mathfrak{a}) \notin \mathfrak{P}$. It follows that $E_c$ has good reduction modulo $\mathfrak{P}$ since 
$c \notin \mathfrak{P}$. 
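
Both computations in this step, namely $j(E_c) = j(\mathfrak{a})$ and the closed form of the discriminant, are identities in the two quantities j and c, so they can be checked with exact rational arithmetic. The sketch below is an illustration only: it substitutes sample rational values for $j(\mathfrak{a})$ and c, and the function names are chosen here.

```python
from fractions import Fraction

def k_from_j(j):
    """k = 27*j / (j - 1728), defined for j != 1728."""
    j = Fraction(j)
    return 27 * j / (j - 1728)

def curve_E_c(j, c):
    """Coefficients (g2, g3) = (k*c^2, k*c^3) of the curve E_c."""
    k, c = k_from_j(j), Fraction(c)
    return k * c**2, k * c**3

def discriminant(g2, g3):
    return g2**3 - 27 * g3**2

def j_invariant(g2, g3):
    return 1728 * g2**3 / discriminant(g2, g3)

def closed_form_discriminant(j, c):
    """1728 * c^6 * 27^3 * j^2 / (j - 1728)^3."""
    j, c = Fraction(j), Fraction(c)
    return 1728 * c**6 * 27**3 * j**2 / (j - 1728) ** 3

g2, g3 = curve_E_c(8000, 5)
print(j_invariant(g2, g3))   # 8000
print(discriminant(g2, g3) == closed_form_discriminant(8000, 5))   # True
```

The closed form makes the good reduction argument visible: the discriminant can only vanish modulo $\mathfrak{P}$ if c or $j(\mathfrak{a})$ does.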

If $O' = Z[i]$ or $Z[\omega]$, then L' = K. Here, we will use the collection of 
elliptic curves $E_c$ defined by 


$$y^2 = 4x^3 - cx, \qquad c \notin \pi Z[i]$$ 
$$y^2 = 4x^3 - c, \qquad c \notin \pi Z[\omega].$$ 


One easily checks that these curves have good reduction modulo $\pi$ and O' 
as their endomorphism ring. 

Theorem 14.16 assures us that every ordinary elliptic curve $\overline{E}$ over $F_p$ 
arises from reduction of some elliptic curve with complex multiplication. 
Given this, it follows without difficulty that $\overline{E}$ is in fact the reduction of one 
of the $E_c$’s constructed above (see Exercise 14.16). 

Given O', there are h(O') distinct j-invariants $j(\mathfrak{a})$, and hence for a 
fixed a, we have 
$$\sum_{O_a \subset O'} h(O')$$ 


distinct collections of elliptic curves $E_c$. Furthermore, another application 
of Theorem 13.21 shows that different collections reduce to curves with 
different j-invariants. Since each collection $E_c$ gives us $p - 1$ curves over 
$F_p$, we get 


$$(p-1) \sum_{O_a \subset O'} h(O')$$ 


elliptic curves over $F_p$. But which of these have $p + 1 - a$ points on them? 
The problem is that Theorem 14.16 implies that $|\overline{E}_c(F_p)|$ is determined by 
some element of O' of norm p, but it need not be $\pi$! All curves in a given 
collection have the same j-invariant, but they need not be isomorphic over 
$F_p$, and hence they may have different numbers of points. In fact, this is 
always the case: 


Proposition 14.19. Let $\overline{E}$ and $\overline{E}'$ be elliptic curves over $F_p$. If $\overline{E}$ is ordinary, 
then $\overline{E}$ and $\overline{E}'$ are isomorphic over $F_p$ if and only if $j(\overline{E}) = j(\overline{E}')$ and 
$|\overline{E}(F_p)| = |\overline{E}'(F_p)|$. 


Proof. One direction of the proof is obvious, but the other requires some 
more advanced concepts. We will give the details since this result doesn’t 
appear in standard references. The key ingredient is a theorem of Tate, 
which asserts that curves with the same number of points over a finite 
field K are isogenous over K (see Husemöller [58, §13.8]). Applying this 
to $|\overline{E}(F_p)| = |\overline{E}'(F_p)|$, we get an isogeny $\lambda : \overline{E} \to \overline{E}'$ defined over $F_p$. 
Replacing $\lambda$ by $1 - \lambda$ if necessary, we may assume that $\lambda$ is separable. Since 
$\overline{E}$ and $\overline{E}'$ have the same j-invariant, we can also find an isomorphism 
$\phi : \overline{E}' \to \overline{E}$ defined over some extension $F_{p^a}$ (see Proposition 14.5). Thus 
$\phi \circ \lambda \in \mathrm{End}_{\overline{F}_p}(\overline{E})$. Since $\overline{E}$ is ordinary, the endomorphism ring is an order 
in an imaginary quadratic field, and it follows that $Z[\mathrm{Frob}_p]$ has finite 
index in $\mathrm{End}_{\overline{F}_p}(\overline{E})$. In Exercise 14.15, we saw that p does not divide the 
conductor of $Z[\pi] \simeq Z[\mathrm{Frob}_p]$, and it follows that p does not divide the 
index m of $Z[\mathrm{Frob}_p] \subset \mathrm{End}_{\overline{F}_p}(\overline{E})$. Thus $m\phi \circ \lambda \in Z[\mathrm{Frob}_p]$, which implies 
that $m\phi \circ \lambda = \phi \circ m\lambda$ is defined over $F_p$. Since $m\lambda$ is separable, the standard 
properties of isogenies imply that $\phi$ is defined over $F_p$ (see Silverman 
[93, Chapter III, Corollary 4.11]). Q.E.D. 


We claim that the collection $E_c$ contains exactly $(p-1)/|O'^*|$ curves 
with $p + 1 - a$ points. This will immediately imply our desired formula. 
Let’s first consider the case when $E_c$ corresponds to a j-invariant $j(\mathfrak{a}) \neq 0$ 
or 1728. Here, the only solutions of $N(\alpha) = p$ in O' are $\alpha = \pm\pi$ and $\pm\overline{\pi}$ 
(see Exercise 14.17). Thus, for each c, Deuring’s theorem tells us that 


$$|\overline{E}_c(F_p)| = p + 1 \pm a.$$ 


However, the curves $\overline{E}_c$ fall into two isomorphism classes, each consisting 
of $(p-1)/2$ curves, corresponding to whether $[c] \in (F_p^*)^2$ or not (see Exercise 
14.18). By the above proposition, nonisomorphic curves have a different 
number of elements, and hence we see that exactly half of the $\overline{E}_c$’s have $p + 
1 - a$ elements. Since $O'^* = \{\pm 1\}$, we get $(p-1)/2 = (p-1)/|O'^*|$ curves 
with $p + 1 - a$ points. 
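
The splitting of a collection into two classes can be observed directly over $F_p$. In the sketch below, which is an illustration with names and sample values chosen here, k is simply a residue modulo p with $k \not\equiv 0, 27$ (so that the discriminant is nonzero), standing in for the reduction of $27j(\mathfrak{a})/(j(\mathfrak{a}) - 1728)$. The code groups the curves $y^2 = 4x^3 - kc^2x - kc^3$ by whether c is a square and records the point counts in each group.

```python
def count_points(g2, g3, p):
    """|E(F_p)| for E: y^2 = 4x^3 - g2*x - g3, with p > 3 prime."""
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(4 * x**3 - g2 * x - g3) % p] for x in range(p))

def is_square(c, p):
    """Euler's criterion for c not divisible by p."""
    return pow(c, (p - 1) // 2, p) == 1

def counts_by_class(k, p):
    """Point counts of the curves E_c, split by whether c is a square mod p."""
    square_counts, nonsquare_counts = set(), set()
    for c in range(1, p):
        n = count_points(k * c * c, k * c**3, p)
        (square_counts if is_square(c, p) else nonsquare_counts).add(n)
    return square_counts, nonsquare_counts

print(counts_by_class(5, 13))
```

Each group yields a single point count, and the two counts add up to $2(p + 1)$, that is, they have the form $p + 1 - a$ and $p + 1 + a$.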

When $j(\mathfrak{a}) = 1728$, things are a bit more complicated. Here, $O' = Z[i]$, 
and $p = \pi\overline{\pi}$ implies that $p \equiv 1 \bmod 4$. The only solutions of $N(\alpha) = p$ are 
$\alpha = \pm\pi, \pm\overline{\pi}, \pm i\pi$, and $\pm i\overline{\pi}$ (see Exercise 14.17), and thus there are at most 
four possibilities for $|\overline{E}_c(F_p)|$. But there are four isomorphism classes of 
curves with j = 1728 in this case, each consisting of $(p-1)/4$ curves (see 
Exercise 14.18). It follows that there are exactly $(p-1)/|O'^*|$ curves with 
$p + 1 - a$ points. The case j = 0 is similar and is left to the reader. 

It remains to study the case $a = 0$, which concerns the number of super-
singular curves over $\mathbb{F}_p$. Since Theorem 14.16 doesn’t apply to this case,
we will take a more indirect approach. Given any $a$ in the range
$-2\sqrt{p} < a < 2\sqrt{p}$, we just proved that when $a \ne 0$, there are
$((p-1)/2)\,H(a^2-4p)$ elliptic curves over $\mathbb{F}_p$ with $p+1-a$ points. Let $SS$
denote the number of supersingular curves. Since there are $p(p-1)$ elliptic
curves over $\mathbb{F}_p$ (see Exercise 14.19), it follows that

(14.20)  $p(p-1) = SS + \sum_{0<|a|<2\sqrt{p}} \frac{p-1}{2}\,H(a^2-4p).$


However, we claim that there is a class number formula

(14.21)  $2p = \sum_{0\le|a|<2\sqrt{p}} H(a^2-4p),$

where the sum now includes the term $a = 0$. Since (14.20) and (14.21) imply
that $SS = ((p-1)/2)\,H(-4p)$, we need only prove (14.21).
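
The formula (14.21) is easy to test numerically for small primes. The
following Python sketch assumes the classical description of the Hurwitz
class number as a weighted count of reduced positive definite forms
$ax^2+bxy+cy^2$ of discriminant $D$ (primitive or not), where multiples of
$x^2+y^2$ count with weight 1/2 and multiples of $x^2+xy+y^2$ with weight 1/3.
The function names are illustrative and do not come from the text.

```python
from fractions import Fraction


def hurwitz_class_number(D):
    """Weighted count of reduced forms (a, b, c) with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    total = Fraction(0)
    a = 1
    while 3 * a * a <= -D:              # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                # not reduced
            if a == b == c:             # multiple of x^2 + xy + y^2
                total += Fraction(1, 3)
            elif b == 0 and a == c:     # multiple of x^2 + y^2
                total += Fraction(1, 2)
            else:
                total += 1
        a += 1
    return total


def class_number_sum(p):
    """Right-hand side of (14.21): sum of H(a^2 - 4p) over all |a| < 2*sqrt(p)."""
    return sum(hurwitz_class_number(a * a - 4 * p)
               for a in range(-2 * p, 2 * p + 1) if a * a < 4 * p)


def supersingular_count(p):
    """SS = ((p - 1)/2) * H(-4p), as deduced from (14.20) and (14.21)."""
    return Fraction(p - 1, 2) * hurwitz_class_number(-4 * p)
```

For $p = 5$ the terms are $H(-20) = 2$, $H(-19) = 1$, $H(-16) = 3/2$,
$H(-11) = 1$ and $H(-4) = 1/2$, so the sum is $2 + 2(1 + 3/2 + 1 + 1/2) = 10 = 2p$.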

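To prove this, note that $H(a^2-4p) = H(\mathcal{O}_a)$, so that by definition, the
right-hand side of (14.21) equals

$$\sum_{0\le|a|<2\sqrt{p}} \; \sum_{\mathcal{O}_a \subset \mathcal{O}'} \frac{2}{|\mathcal{O}'^*|}\,h(\mathcal{O}').$$

If we define the function $\chi(a)$ by

$$\chi(a) = \begin{cases} 1 & \text{if } \mathcal{O}_a \subset \mathcal{O}' \\ 0 & \text{otherwise,} \end{cases}$$

then the above sum can be written as

$$\sum_{\mathcal{O}'} \Bigl( \frac{2}{|\mathcal{O}'^*|} \sum_{0\le|a|<2\sqrt{p}} \chi(a) \Bigr) h(\mathcal{O}').$$

It is easy to prove that the quantity in parentheses is $r(\mathcal{O}', p)$, which we
defined in §13 to be $|\{\alpha \in \mathcal{O}' : N(\alpha) = p\}/\mathcal{O}'^*|$ (see Exercise 14.20). Thus
the right-hand side of (14.21) becomes

$$\sum_{\mathcal{O}'} r(\mathcal{O}', p)\,h(\mathcal{O}').$$

In Corollary 13.9 we proved that this quantity equals $2p$, and (14.21) is
proved. This completes the proof of Theorem 14.18. Q.E.D.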


Recall that Corollary 13.9 was part of our study of the polynomial
$\Phi_p(X, X)$. It is rather unexpected that the modular equation has a
connection with supersingular curves over $\mathbb{F}_p$. This is just more evidence
of the amazing richness of the study of elliptic curves. To pursue these
topics further, the reader should consult Lang [73] and Shimura [90]. Also,
see the monographs by Cassou-Nogués and Taylor [15] and by Gross [45] for
an introduction to some of the current research concerning elliptic curves
and complex multiplication.


D. Elliptic Curve Primality Tests


In the past few years, there have been some surprising applications of el- 
liptic curves to problems involving factoring and primality. In 1985, Lenstra 
announced an elliptic curve factoring method [76], and a year later, Gold- 
wasser and Kilian adapted Lenstra’s method to obtain an elliptic curve pri- 
mality test [43]. Both methods use the properties of elliptic curves over 
finite fields. We will concentrate on the Goldwasser—Kilian Test and its re- 
cent variation, the Goldwasser—Kilian—Atkin Test. This last test is especially 
interesting, for it uses the class equations studied in §13. Thus, the
polynomial $H_{-4n}(X)$, which appears in our criterion for when $p$ is of the
form $x^2 + ny^2$, can actually be used to prove that $p$ is prime! Our treatment of
these tests will not be complete, and for further details, we refer the reader 
to the articles by Goldwasser and Kilian [43], Lenstra [76] and Morain [79]. 

Given a potential prime $l$, the goal of these tests is to prove the primality
of $l$ by considering elliptic curves over the field $\mathbb{Z}/l\mathbb{Z}$. Since we don’t
know that $l$ is prime, we must treat $\mathbb{Z}/l\mathbb{Z}$ as a ring, and thus we need a
theory of elliptic curves over rings. Fortunately, the basic ideas carry over
quite easily. Let $R$ be any commutative ring with identity where 2 and 3 are
units. Then an elliptic curve $E$ over $R$ is a Weierstrass equation of the
usual form

$$y^2 = 4x^3 - g_2 x - g_3, \qquad g_2, g_3 \in R,$$

where we now require that

(14.22)  $\Delta = g_2^3 - 27 g_3^2 \in R^*.$

Note that since $\Delta$ is a unit in $R$, the $j$-invariant

$$j(E) = 1728\,\frac{g_2^3}{\Delta} \in R$$

is defined.
Given an elliptic curve $E$ over $R$, we set

$$E_0(R) = \{(x,y) \in R \times R : y^2 = 4x^3 - g_2 x - g_3\} \cup \{\infty\}.$$

The reason for the new notation is that $E_0(R)$ may fail to be a group! To
see this, consider $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ in $E_0(R)$. If $x_1 \ne x_2$, then
we would like to define

$$P_1 + P_2 = (x_3, y_3),$$

where $x_3$ and $y_3$ are given by the formulas (14.6). The problem comes from
the denominator $x_1 - x_2$: it is nonzero in $R$, but it need not be invertible!
For this reason, the binary operation is only partially defined on $E_0(R)$.
Using tools from algebraic geometry, one can define a superset $E(R)$ of
$E_0(R)$ which is a group, but we prefer to use $E_0(R)$ because it is easier to
work with in practice.

If $E$ is an elliptic curve over $\mathbb{Z}/l\mathbb{Z}$, the potentially incomplete group
structure on $E_0(\mathbb{Z}/l\mathbb{Z})$ is not a problem. Namely, if we ever found $P_1$ and
$P_2$ in $E_0(\mathbb{Z}/l\mathbb{Z})$ such that $P_1 + P_2$ wasn’t defined, then it would follow
automatically that $l$ must be composite, and the noninvertible denominator
would give us a factor of $l$ (just compute the appropriate gcd). This
observation is the driving force of Lenstra’s elliptic curve factoring
algorithm (see [76]).
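
The partially defined addition can be sketched in a few lines of Python.
This is a minimal sketch, assuming that the formulas (14.6) are the usual
chord and tangent formulas for $y^2 = 4x^3 - g_2 x - g_3$; the names `ec_add`,
`inverse` and `NotInvertible` are hypothetical and chosen for illustration.
Points are pairs of integers reduced mod $n$, and $n$ is assumed prime to 6.

```python
from math import gcd


class NotInvertible(Exception):
    """Raised when a needed denominator is not a unit mod n."""

    def __init__(self, factor):
        super().__init__(f"denominator shares the factor {factor} with the modulus")
        self.factor = factor


def inverse(d, n):
    """Inverse of d mod n, or NotInvertible carrying gcd(d, n)."""
    g = gcd(d % n, n)
    if g != 1:
        raise NotInvertible(g)   # g may equal n itself, which is not useful
    return pow(d % n, -1, n)


INF = None  # the point at infinity


def ec_add(P, Q, g2, n):
    """Partial addition on E_0(Z/nZ) for y^2 = 4x^3 - g2*x - g3."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return INF                                    # Q = -P
    if x1 == x2 and y1 == y2:
        m = (12 * x1 * x1 - g2) * inverse(2 * y1, n)  # tangent slope
    else:
        m = (y2 - y1) * inverse(x2 - x1, n)           # chord slope
    x3 = (m * m * inverse(4, n) - x1 - x2) % n
    y3 = (-(m * (x3 - x1) + y1)) % n
    return (x3, y3)
```

For example, $(1,1)$ and $(6,6)$ both lie on $y^2 = 4x^3 - 4x + 1$ over $\mathbb{Z}/35\mathbb{Z}$,
and trying to add them raises `NotInvertible` with the factor 5, since
$x_2 - x_1 = 5$ is not a unit mod 35.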

Before discussing the Goldwasser-Kilian Test, let’s review some basic
ideas concerning primality testing. We regard $l$ as an input of length
$[\log_{10} l]$, where $[\ ]$ is the greatest integer function. The length is thus
bounded by a constant times $\ln l$, which we express by writing
$[\log_{10} l] = O(\ln l)$. The most interesting question concerning a primality
test is its running time: given an input $l$, how long, as a function of
$\ln l$, does it take a given algorithm to prove that $l$ is (or is not) prime?
The simplest algorithm (divide by all numbers $\le \sqrt{l}$) requires

$$\sqrt{l} = e^{(1/2)\ln l}$$

divisions, and hence runs in exponential time. What we really want is an
algorithm that runs in polynomial time, i.e., where its running time is
$O((\ln l)^d)$ for some fixed $d$. Right now, no polynomial time algorithm is
known, although there is a candidate for one; see Wagon [99] for further
details.

Another sort of algorithm commonly used is what is called a probabilis-
tic primality test. Such a test has two outputs, “prime” and “composite or
unluckily prime.” In the former case, the program proves the primeness of
$l$, while in the latter case, it says either that $l$ is composite or that $l$ is
prime and we were unlucky. A nice discussion of probabilistic primality tests
may be found in Wagon’s article [99]. For our purposes, we will explain this
concept by considering the following very special probabilistic primality
test.

Let $l$ be our potential prime, relatively prime to 6, and suppose that we
have an elliptic curve $E$ over $\mathbb{Z}/l\mathbb{Z}$ with the following two properties:

(i) $l + 1 - 2\sqrt{l} < |E_0(\mathbb{Z}/l\mathbb{Z})| < l + 1 + 2\sqrt{l}$.
(ii) $|E_0(\mathbb{Z}/l\mathbb{Z})| = 2q$, where $q$ is an odd prime.

In certain situations, this setup can be used to prove primality:


Lemma 14.23. Let $l$ and $E$ be as above, and assume $l > 13$. Let $P \ne \infty$ be
in $E_0(\mathbb{Z}/l\mathbb{Z})$. If $qP$ is defined and equal to $\infty$ in $E_0(\mathbb{Z}/l\mathbb{Z})$, then $l$ is prime.


Proof. Assume that $l$ is not prime, and let $p \le \sqrt{l}$ be a prime divisor of $l$.
Using the natural map $\mathbb{Z}/l\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$, we can reduce the equation of
$E$ modulo $p$, and by (14.22), we get an elliptic curve $\bar{E}$ over $\mathbb{F}_p$. Further-
more, we get a natural map

$$E_0(\mathbb{Z}/l\mathbb{Z}) \longrightarrow \bar{E}(\mathbb{F}_p)$$

which takes $P = (x,y) \ne \infty$ in $E_0(\mathbb{Z}/l\mathbb{Z})$ to $\bar{P} = (\bar{x},\bar{y}) \ne \infty$ in $\bar{E}(\mathbb{F}_p)$. Since
this map is also clearly a homomorphism (wherever defined), it follows that
$q\bar{P} = \infty$ in $\bar{E}(\mathbb{F}_p)$. But $q$ is prime, so that $\bar{P}$ is a point of order $q$, and
hence

$$q \le |\bar{E}(\mathbb{F}_p)| \le p + 1 + 2\sqrt{p},$$

where the second inequality comes from Hasse’s theorem (Theorem 14.12).
Since $p \le \sqrt{l}$, this implies that

$$q \le \sqrt{l} + 1 + 2\sqrt[4]{l} = (\sqrt[4]{l} + 1)^2.$$

However, by assumption, we have

$$2q = |E_0(\mathbb{Z}/l\mathbb{Z})| > l + 1 - 2\sqrt{l} = (\sqrt{l} - 1)^2.$$

Combining these two inequalities, we obtain

$$\sqrt{l} - 1 < \sqrt{2}\,(\sqrt[4]{l} + 1),$$

which is easily seen to be impossible for $l > 13$. This contradiction proves
the lemma. Q.E.D.


To convert this lemma into a probabilistic primality test, we need one
more observation. Namely, if $l$ is prime and $|E_0(\mathbb{Z}/l\mathbb{Z})| = 2q$, $q$ an odd
prime, then $E_0(\mathbb{Z}/l\mathbb{Z})$ must be a cyclic group, and hence exactly $q - 1$ of
the $2q - 1$ nonidentity elements have order $q$. Thus, the probability that
a randomly chosen $P \ne \infty$ doesn’t prove primality (i.e., has order $\ne q$) is
$q/(2q - 1) \approx 1/2$, assuming that $q$ is large.

Now we can state the test. Let $E$ and $l$ be as above, pick $k$ randomly
chosen points $P_1, \ldots, P_k$ from $E_0(\mathbb{Z}/l\mathbb{Z})$, and then compute $qP_1, \ldots, qP_k$. If
any one of these is defined and equals $\infty$, then by the above lemma, we
have a proof of primality. If none of $qP_1, \ldots, qP_k$ satisfy this condition, then
either $l$ is composite, or $l$ is prime and we were unlucky. To see how un-
lucky, suppose that $l$ were prime. Then our test fails only if all of
$P_1, \ldots, P_k$ have order $\ne q$. By the above paragraph, the probability of this
happening is

$$\left(\frac{q}{2q-1}\right)^k \approx \left(\frac{1}{2}\right)^k.$$


So we can’t guarantee a proof of primality, but we have to be mighty un- 
lucky not to find one. 
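
The special test is short enough to sketch in Python. This is an
illustration under stated assumptions, not the implementation used in
practice: the group law is taken to be the usual chord and tangent
construction for $y^2 = 4x^3 - g_2 x - g_3$, the caller is assumed to have
verified (i) and (ii) already, and the helper names (`ec_add`, `ec_mul`,
`special_test`, `affine_points`) are hypothetical. The brute-force point
list is only sensible for tiny $l$.

```python
from math import gcd

INF = "inf"  # the point at infinity


def ec_add(P, Q, g2, l):
    """Partial addition on E_0(Z/lZ); returns None when the sum is not defined."""
    if P is None or Q is None:
        return None
    if P == INF:
        return Q
    if Q == INF:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % l == 0:
        return INF
    if x1 == x2 and y1 == y2:
        num, den = 12 * x1 * x1 - g2, 2 * y1   # tangent slope
    else:
        num, den = y2 - y1, x2 - x1            # chord slope
    if gcd(den % l, l) != 1:
        return None                            # noninvertible denominator
    m = num * pow(den % l, -1, l) % l
    x3 = (m * m * pow(4, -1, l) - x1 - x2) % l
    return (x3, (-(m * (x3 - x1) + y1)) % l)


def ec_mul(k, P, g2, l):
    """Compute kP by double-and-add; None if a needed sum is undefined."""
    result, addend = INF, P
    while k:
        if k & 1:
            result = ec_add(result, addend, g2, l)
        addend = ec_add(addend, addend, g2, l)
        k >>= 1
    return result


def special_test(l, g2, q, points):
    """Lemma 14.23: True if some P != INF in `points` has qP defined and equal to INF."""
    return any(P != INF and ec_mul(q, P, g2, l) == INF for P in points)


def affine_points(l, g2, g3):
    """Brute-force list of the affine points of E_0(Z/lZ)."""
    return [(x, y) for x in range(l) for y in range(l)
            if (y * y - (4 * x**3 - g2 * x - g3)) % l == 0]
```

When $l$ is prime and the curve has $2q$ points, exactly $q - 1$ of the affine
points make `ec_mul(q, P, g2, l)` return `INF`, which matches the count
given above.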

This test depended on the assumptions (i) and (ii) above. The first as-
sumption is quite reasonable, since by Hasse’s theorem it holds if $l$ is prime.
So if (i) fails, we have a proof of compositeness. But the second assump-
tion, that $|E_0(\mathbb{Z}/l\mathbb{Z})|$ is twice a prime, is very special, and certainly fails
for most elliptic curves. An added difficulty is that $|E_0(\mathbb{Z}/l\mathbb{Z})|$ is a very
large number (by (i), it has the same order of magnitude as $l$). Thus, even
if $|E_0(\mathbb{Z}/l\mathbb{Z})| = 2q$ were twice a prime, we’d be unlikely to know it, since
we’d have to prove that $q$, a number roughly the size of $l/2$, is also prime.

To overcome these problems, Goldwasser and Kilian used two ideas. The 
first idea is quite simple: 


         Choose lots of elliptic curves $E$ over $\mathbb{Z}/l\mathbb{Z}$ at random.
(14.24)  If we get one where $|E_0(\mathbb{Z}/l\mathbb{Z})| = 2q$, $q$ a probable prime,
         then use the above special test to check for primality.


Notice the word “probable.” Using known probabilistic compositeness tests
(described in Wagon [99]), one can efficiently reduce to the case where
$|E_0(\mathbb{Z}/l\mathbb{Z})|$ is of the form $2q$, where $q$ is probably prime. If the special test
succeeds, we have proved that $l$ is prime, provided that $q$ is prime. Then
the second idea is

(14.25)  Make the above process recursive.

This means proving $q$ is prime by applying the special test to an elliptic
curve over $\mathbb{Z}/q\mathbb{Z}$ of order $2q'$, $q'$ a probable prime. In this way the primality
of $q'$ implies the primality of $q$. Since each iteration reduces the size by a
factor of 2 (i.e., $q$ is about the size of $l/2$, $q'$ is about the size of $q/2$,
etc.), it follows that in $O(\ln l)$ steps the numbers will get small enough that
primality can be verified easily.

The algorithm contained in (14.24) and (14.25) is the heart of the Gold-
wasser-Kilian primality test (see their article [43] for a fuller discussion).
The key unanswered question concerns (14.24): when $l$ is prime, how many
elliptic curves do we have to choose before finding one where $|E(\mathbb{Z}/l\mathbb{Z})|$ is
twice a prime? The following result of Lenstra plays a crucial role:


Theorem 14.26. Let $l$ be a prime, and let

$$S = \{2q : q \text{ prime},\ l + 1 - \sqrt{l} \le 2q \le l + 1 + \sqrt{l}\}.$$

Then there is a constant $c_1 > 0$, independent of $l$ and $S$, such that the number
of elliptic curves $E$ over $\mathbb{F}_l$ satisfying $|E(\mathbb{F}_l)| \in S$ is at least

$$c_1\,\frac{(|S| - 2)\,\sqrt{l}\,(l - 1)}{\ln l}.$$


Remark. Notice that the elliptic curves described in this theorem satisfy
$l + 1 - \sqrt{l} \le |E(\mathbb{F}_l)| \le l + 1 + \sqrt{l}$, which is more restrictive than the bound
given by Hasse’s theorem. The proof below will explain the reason for this.


Proof. Given $2q \in S$, write $2q = l + 1 - a$. Then we proved in Theorem
14.18 that the number of curves with $2q = l + 1 - a$ points is
$((l - 1)/2) \times H(a^2 - 4l)$, where $H(a^2 - 4l)$ is the Hurwitz class number defined
earlier. Using classically known bounds on class numbers, Lenstra proved in
[76] that for $2q \in S$, with at most two exceptions, there is the estimate

$$H(a^2 - 4l) \ge c \cdot \frac{\sqrt{|a^2 - 4l|}}{\ln l},$$

where $c$ is a constant independent of the discriminant (see [76, Proposition
1.8]). We are assuming that $|a| \le \sqrt{l}$, which implies $\sqrt{|a^2 - 4l|} \ge \sqrt{3l}$, and
consequently

$$\frac{l - 1}{2}\,H(a^2 - 4l) \ge c_1\,\frac{\sqrt{l}\,(l - 1)}{\ln l},$$

where $c_1 = \sqrt{3}\,c/2$. The theorem follows immediately. Q.E.D.


By this theorem, we are reduced to knowing the number of primes
in the interval $[(l + 1)/2 - \sqrt{l}/2,\ (l + 1)/2 + \sqrt{l}/2]$. By the Prime Number
Theorem, the probability that a number in the interval $[0, N]$ is prime is
approximately $1/\ln N$. It is conjectured that this holds for intervals of
shorter length. Applied to the above, we get the following conjecture:

Conjecture 14.27. There is a constant $c_2 > 0$ such that, for all sufficiently
large primes $l$, the number of primes in the interval
$[(l + 1)/2 - \sqrt{l}/2,\ (l + 1)/2 + \sqrt{l}/2]$ is at least

$$c_2\,\frac{\sqrt{l}}{\ln l}.$$

If this conjecture were true, then Theorem 14.26 would imply that when $l$
is large, there is a constant $c_3$, independent of $l$, such that at least

$$c_3\,\frac{l(l - 1)}{(\ln l)^2}$$

elliptic curves $E$ over $\mathbb{F}_l$ have order $|E(\mathbb{F}_l)| = 2q$ for some prime $q$ (see
Exercise 14.21). Since there are $l(l - 1)$ elliptic curves over $\mathbb{F}_l$, it follows
that there is a probability of at least

(14.28)  $c_3/(\ln l)^2$

that $|E(\mathbb{F}_l)|$ has the desired order.

Now we can explain how many curves need to be chosen in (14.24).
Namely, pick an integer $k$, and pick $k(\ln l)^2/c_3$ randomly chosen elliptic
curves over $\mathbb{Z}/l\mathbb{Z}$. If $l$ were prime, could all these curves fail to have order
twice a prime? By (14.28), the probability of this happening is less than

$$\left(1 - \frac{c_3}{(\ln l)^2}\right)^{k(\ln l)^2/c_3} < e^{-k}.$$
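
The last inequality is the elementary estimate $\ln(1 - x) < -x$ for
$0 < x < 1$. As a short check, write $x = c_3/(\ln l)^2$ and assume that $l$ is
large enough that $x < 1$:

$$(1 - x)^{k/x} = e^{(k/x)\ln(1 - x)} < e^{(k/x)(-x)} = e^{-k}.$$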




It remains to give a run time analysis of the Goldwasser-Kilian Test.
For (14.24), we need to pick $O((\ln l)^2)$ curves and count the number of
points on each one. By an algorithm of Schoof (see Morain [79, §5.5]),
it takes $O((\ln l)^8)$ to count the points on each curve. Once a curve with
$|E_0(\mathbb{Z}/l\mathbb{Z})| = 2q$ is found, we then need to pick points $P \in E_0(\mathbb{Z}/l\mathbb{Z})$ and
compute gP. These operations are bounded by O((In/)®) (see Goldwasser 
and Kilian [43, §4.3]), and thus the run time of (14.24) is $O((\ln l)^{10})$. By
(14.25), we have to iterate this $O(\ln l)$ times, so that the run time of the
whole algorithm is $O((\ln l)^{11})$.

The above analysis is predicated on Conjecture 14.27, which may be very
difficult to prove (or even false!). But now comes the final ingredient: using
known results about the distribution of primes, Goldwasser and Kilian were
able to prove that their algorithm terminates with a run time of $O(k^{11})$ for
at least

$$\left(1 - O\!\left(2^{-k^{1/\ln\ln k}}\right)\right) \times 100$$

percent of the prime inputs of length $k$ (see [43, Theorem 3]). Thus the
Goldwasser-Kilian Test is almost a polynomial time probabilistic primality
test!

In practice, the implementation of the Goldwasser-Kilian Test is more 
complicated than the algorithm sketched above. The main difference is that 
the order $|E_0(\mathbb{Z}/l\mathbb{Z})|$ is allowed to be of the form $mq$, where $m$ may be
bigger than 2 but is still small compared to q. This means that fewer elliptic 
curves must be tried before finding a suitable one, and thus the algorithm 
runs faster. For the details of how this is done, see Goldwasser and Kilian 
[43, §4.4] or Morain [79, §§2.2.2 and 7.7]. 

The most “expensive” part of the Goldwasser-Kilian Test is the $O((\ln l)^8)$
spent counting the points on a given elliptic curve. So rather than starting
with $E$ and then computing $|E_0(\mathbb{Z}/l\mathbb{Z})|$ the hard way, why not use the theory
developed earlier to predict the order? This is the basis of the Goldwasser-
Kilian-Atkin Test, which we will discuss next.

The wonderful thing about this test is that it brings us back to our topic
of primes of the form $x^2 + ny^2$. To see why, let $l$ be a prime, and let $n$ be
a positive integer such that $l$ can be written as

$$l = a^2 + nb^2, \qquad a, b \in \mathbb{Z}.$$

We will use this information to produce an elliptic curve over $\mathbb{F}_l$ with
$l + 1 - 2a$ points on it. The basic idea is to use the characterization of primes
of the form $x^2 + ny^2$ proved in §13:

$$l = x^2 + ny^2 \iff \begin{cases} (-n/l) = 1 \text{ and } H_{-4n}(X) \equiv 0 \bmod l \\ \text{has an integer solution,} \end{cases}$$
where $H_{-4n}(X)$ is the class equation for discriminant $-4n$ (see the proof
of Theorem 13.23). Thus $l = a^2 + nb^2$ gives us a solution $j$ of the con-
gruence $H_{-4n}(X) \equiv 0 \bmod l$, and for simplicity, we will suppose that
$j \not\equiv 0, 1728 \bmod l$. Define $k \in \mathbb{F}_l$ to be the congruence class

$$k = \frac{27j}{j - 1728}$$

and then consider the two elliptic curves

         $y^2 = 4x^3 - kx - k$
(14.29)
         $y^2 = 4x^3 - c^2 kx - c^3 k,$

where $c \in \mathbb{F}_l$ is a nonsquare. We have the following result:


Proposition 14.30. Of the two elliptic curves over $\mathbb{F}_l$ defined in (14.29), one
has order $l + 1 - 2a$, and the other has order $l + 1 + 2a$, where $l = a^2 + nb^2$.


Proof. Let $L$ be the ring class field of the order $\mathcal{O} = \mathbb{Z}[\sqrt{-n}]$, and let
$H_{-4n}(X) = \prod_i (X - j(\mathfrak{a}_i))$ be the class equation. If $\mathfrak{P}$ is a prime of $L$
containing $l$, then the isomorphism $\mathcal{O}_L/\mathfrak{P} \simeq \mathbb{Z}/l\mathbb{Z} = \mathbb{F}_l$ shows that our solution
$j$ of $H_{-4n}(X) \equiv 0 \bmod l$ satisfies $j \equiv j(\mathfrak{a}_i) \bmod \mathfrak{P}$ for some $i$. It follows
that the curves (14.29) are members of the corresponding collection $\bar{E}_c$
constructed in the proof of Theorem 14.18, and our proposition then follows
immediately since $l = \pi\bar{\pi}$ in $\mathcal{O}$, where $\pi = a + b\sqrt{-n}$. Q.E.D.
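
Proposition 14.30 can be checked by brute force in a small case. The sketch
below takes $n = 2$ and uses the standard fact (not restated in this section)
that $j(\sqrt{-2}) = 8000$, so that $H_{-8}(X) = X - 8000$. The function names are
hypothetical, and counting points by enumeration is only feasible for very
small $l$.

```python
def count_points(l, g2, g3):
    """|E(F_l)| for y^2 = 4x^3 - g2*x - g3, including the point at infinity."""
    squares = {}
    for y in range(l):
        squares[y * y % l] = squares.get(y * y % l, 0) + 1
    return 1 + sum(squares.get((4 * x**3 - g2 * x - g3) % l, 0) for x in range(l))


def atkin_curves(l, j):
    """The two curves (14.29) as (g2, g3) pairs, for j not 0 or 1728 mod l."""
    assert j % l != 0 and (j - 1728) % l != 0
    k = 27 * j * pow((j - 1728) % l, -1, l) % l
    # smallest nonsquare c mod l, detected with Euler's criterion
    c = next(c for c in range(2, l) if pow(c, (l - 1) // 2, l) == l - 1)
    return (k, k), (c * c * k % l, c**3 * k % l)


def orders_for(l, j):
    """Sorted orders of the two curves (14.29)."""
    return sorted(count_points(l, g2, g3) for g2, g3 in atkin_curves(l, j))


def predicted_orders(l, a):
    """The orders l + 1 - 2a and l + 1 + 2a from Proposition 14.30."""
    return sorted([l + 1 - 2 * a, l + 1 + 2 * a])
```

For $l = 11 = 3^2 + 2 \cdot 1^2$ one finds $k = 2$, and the two curves have 18 and 6
points, which are $l + 1 \pm 2a$ with $a = 3$.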


The curves (14.29) don’t make sense when $j \equiv 0, 1728 \bmod l$, but the
proof of Theorem 14.18 makes it clear how to proceed in these cases.

We can now sketch the Goldwasser-Kilian-Atkin Test. Given a potential
prime $l$, one searches for the smallest $n$ with $l$ of the form $a^2 + nb^2$. Once
we succeed, we check if either of $l + 1 \pm 2a$ is twice a probable prime $q$. If
not, we look for the next $n$ with $l = a^2 + nb^2$. We continue this until
$l + 1 \pm 2a$ has the right form, and then we apply the special primality test
embodied in Lemma 14.23, using the two curves given in (14.29). In this way,
we can prove that $l$ is prime, provided that $q$ is prime. Then, as in the
regular Goldwasser-Kilian Test, we make the whole process recursive.

In practice, the implementation of the Goldwasser-Kilian-Atkin Test im-
proves the run time by allowing the order $l + 1 \pm 2a$ to be more compli-
cated than just twice a prime. The complete description of an implementa-
tion can be found in Morain’s article [79].

For our purposes, this test is wonderful because it relates so nicely to
our problem of when a prime is of the form $x^2 + ny^2$. But from a prac-
tical point of view, the situation is less than ideal, for the test requires
knowing $H_{-4n}(X)$, a polynomial with notoriously large coefficients. So in
implementing the Goldwasser-Kilian-Atkin Test, one of the main goals is
to avoid computing the full class equation. Different authors have taken dif-
ferent approaches to this problem, but the basic idea in each case is to use
the Weber functions $\mathfrak{f}(\tau)$, $\mathfrak{f}_1(\tau)$ and $\mathfrak{f}_2(\tau)$ from §12. In [79, §6.2], Morain
uses formulas of Weber, such as the one quoted in §12

$$\mathfrak{f}(\sqrt{-105})^6 = \sqrt{2}^{\,-13}(1 + \sqrt{3})^3(1 + \sqrt{5})^3(\sqrt{3} + \sqrt{7})^3(\sqrt{5} + \sqrt{7}),$$

to determine a root of $H_{-4n}(X)$ modulo $l$ when $n$ is one of Euler’s conve-
nient numbers (as defined in §3). Another approach, suggested by Kaltofen,
Valente and Yui [64], is to use the methods of Kaltofen and Yui [65] to
compute the minimal polynomials of the Weber functions. To see the po-
tential savings, consider the case $n = 14$. We proved in §12 that

$$\mathfrak{f}_1(\sqrt{-14})^2 = \frac{\sqrt{2} + 1 + \sqrt{2\sqrt{2} - 1}}{\sqrt{2}}$$

$$j(\sqrt{-14}) = \left(\frac{\mathfrak{f}_1(\sqrt{-14})^{24} + 16}{\mathfrak{f}_1(\sqrt{-14})^{8}}\right)^3.$$

It is clear which one has the simpler minimal polynomial! The papers by
Kaltofen, Valente and Yui [64] and Morain [79] give more details on the
various implementations of the Goldwasser-Kilian-Atkin Test.

Primality testing is a good place to end this book, for primes are the ba-
sis of all number theory. We started in §1 with concrete questions concern-
ing $p = x^2 + y^2$, $x^2 + 2y^2$ and $x^2 + 3y^2$, and followed the general question
of $x^2 + ny^2$ through various wonderful areas of number theory. The theory
of §8 was rather abstract, and even the ring class fields of §9 were not very
intuitive. Complex multiplication helped bring these ideas down to earth,
and now elliptic curves bring us back to the question of proving that a
given number is prime. Fermat and Euler would have loved it.


E. Exercises 


14.1. Let $K$ be a field, and let $\mathbb{P}^2(K)$ be the projective plane over $K$,
which is the set $(K^3 - \{0\})/\!\sim$, where we set $(\lambda x, \lambda y, \lambda z) \sim (x, y, z)$
for all $\lambda \in K^*$.

(a) Show that the map $(x, y) \mapsto (x, y, 1)$ defines an injection
$K^2 \to \mathbb{P}^2(K)$ and that the complement $\mathbb{P}^2(K) - K^2$ consists of those
points with $z = 0$ (this is called the line at infinity).

(b) Given an elliptic curve $E$ over $K$ defined by the Weierstrass
equation $y^2 = 4x^3 - g_2 x - g_3$, we form the equation

$$y^2 z = 4x^3 - g_2 x z^2 - g_3 z^3,$$

which is a homogeneous equation of degree 3. Then we define

$$\tilde{E}(K) = \{(x, y, z) \in \mathbb{P}^2(K) : y^2 z = 4x^3 - g_2 x z^2 - g_3 z^3\}.$$

To relate this to $E(K)$, show that

$$\tilde{E}(K) = \{(x, y, 1) \in \mathbb{P}^2(K) : y^2 = 4x^3 - g_2 x - g_3\} \cup \{(0, 1, 0)\}.$$

Thus the projective solutions consist of the solutions of the
affine equation together with one point at infinity, $(0, 1, 0)$. This
is the point denoted $\infty$ in the text.

14.2. Let $L \subset \mathbb{C}$ be a lattice, and let $y^2 = 4x^3 - g_2(L)x - g_3(L)$ be the
corresponding elliptic curve. Then show that the map $z \mapsto (\wp(z), \wp'(z))$
induces a bijection

$$(\mathbb{C} - L)/L \longrightarrow E(\mathbb{C}) - \{\infty\}.$$

Hint: use Lemma 10.4 and part (b) of Exercise 10.14. Note also that
$\wp'(z)$ is an odd function.

14.3. Prove Proposition 14.3.

14.4. In this exercise we will study elliptic curves with the same
$j$-invariant.

(a) Prove Propositions 14.4 and 14.5.

(b) Consider the elliptic curves $y^2 = 4x^3 - g_3$, where $g_3$ is any non-
zero integer. These curves all have $j$-invariant 0, so that they are
all isomorphic over $\mathbb{C}$. Show that over $\mathbb{Q}$, these curves break up
into infinitely many isomorphism classes.

14.5. In this exercise we will study the addition and duplication laws of
$\wp'(z)$.

(a) Use formula (14.6) and the addition law for $\wp(z + w)$ (see Theo-
rem 10.1) to conjecture and prove an addition law for $\wp'(z + w)$.

(b) Use formula (14.7) and the duplication law for $\wp(2z)$ (see (10.13))
to conjecture and prove a duplication law for $\wp'(2z)$.

14.6. If $L$ and $L'$ are lattices and $\alpha L \subset L'$, where $\alpha \ne 0$, show that the
kernel of the map $\alpha : \mathbb{C}/L \to \mathbb{C}/L'$ is isomorphic to $L'/\alpha L$.

14.7. Complete the proof (begun in (14.10)) that

$$(x, y) \mapsto \left(-\frac{2x^2 + 4x + 9}{4(x + 2)},\ -\frac{1}{\sqrt{-2}}\,\frac{2x^2 + 8x - 1}{4(x + 2)^2}\,y\right)$$

defines the isogeny of $y^2 = 4x^3 - 30x - 28$ given by complex multi-
plication by $\sqrt{-2}$. Hint: use the discussion surrounding (10.21) and
(10.22).

14.8. Let $E$ be an elliptic curve over the finite field $\mathbb{F}_q$, and for any
extension $\mathbb{F}_q \subset L$, define $\mathrm{Frob}_q : E(L) \to E(L)$ by $\mathrm{Frob}_q(x, y) = (x^q, y^q)$.

(a) Show that $\mathrm{Frob}_q$ is a group homomorphism.

(b) Show that $\mathrm{Frob}_q$ is not of the form $(R(x), (1/\alpha)R'(x)y)$ for any
rational function $R(x)$.

14.9. Formulate and prove a version of Proposition 14.9 that applies to
lattices $L$ and $L'$ such that $\alpha L \subset L'$ for some $\alpha \in \mathbb{C}^*$.

14.10. Use Theorem 11.23 to prove Proposition 14.11.

14.11. If $E$ is an elliptic curve over $\mathbb{F}_q$, then prove that $|E(\mathbb{F}_q)| \le 2q + 1$.
Hint: given $x$, how many $y$’s can satisfy $y^2 = 4x^3 - g_2 x - g_3$?

14.12. If $E$ is an elliptic curve over $\mathbb{F}_q$, then show that

$$E(\mathbb{F}_q) = \ker(1 - \mathrm{Frob}_q : E(\bar{\mathbb{F}}_q) \to E(\bar{\mathbb{F}}_q)),$$

where $\bar{\mathbb{F}}_q$ is the algebraic closure of $\mathbb{F}_q$. Hint: for $x \in \bar{\mathbb{F}}_q$, recall
that $x \in \mathbb{F}_q$ if and only if $x^q = x$.

14.13. This exercise is concerned with the relation between Gauss’s claim
(14.17) and Theorem 14.16.

(a) Verify that the transformation $(x, y) \mapsto (3x/(1 + y),\ 9(1 - y)/(1 + y))$
takes the curve $x^3 = y^3 + 1$ into the elliptic curve $E$
defined by $y^2 = 4x^3 - 27$.

(b) The projective version of (a) is given by
$(x, y, z) \mapsto (3x,\ 9(z - y),\ z + y)$. Check that $(0, -1)$ on $x^3 = y^3 + 1$ is
the only point that maps to $\infty = (0, 1, 0)$ on $E$.

(c) Check that $x^3 = y^3 + 1$ has three points at infinity. Hint: re-
member that $p \equiv 1 \bmod 3$.

(d) Show that $E$ has complex multiplication by $\mathbb{Z}[\omega]$, $\omega = e^{2\pi i/3}$.
Hint: see Exercise 10.17.

14.14. The last entry in Gauss’ mathematical diary states that for a prime
$p \equiv 1 \bmod 4$,

    If $p = a^2 + b^2$ and $a + bi$ is primary, then $N = p - 2a - 3$,
    where $N$ is the number of solutions modulo $p$ of
    $x^2 + y^2 + x^2 y^2 \equiv 1 \bmod p$.

Show that this is a special case of Theorem 14.16. Hint: use the
change of variables $(x, y) \mapsto ((1 + x)/2(1 - x),\ (1 + x^2)y/(1 - x)^2)$
to transform the curve $x^2 + y^2 + x^2 y^2 = 1$ into the elliptic curve
$y^2 = 4x^3 + x$. See the discussion surrounding (4.24) for more de-
tails and references.


14.15. Prove that $p$ does not divide the discriminant of the order $\mathcal{O}_a$ de-
fined in the proof of Theorem 14.18.

14.16. Let $E$ be an elliptic curve over a field $K$, and assume that its
$j$-invariant $j$ is different from 0 and 1728. Then define $k \in K$ to be
the number

$$k = \frac{27j}{j - 1728}.$$

Then show that the Weierstrass equation for $E$ can be written in
the form

$$y^2 = 4x^3 - c^2 kx - c^3 k$$

for a unique $c \in K^*$. Hint: $c = g_3/g_2$.

14.17. Let $\mathcal{O}$ be an order in an imaginary quadratic field, and let $p$ be a
prime not dividing the conductor of $\mathcal{O}$. If $\pi \in \mathcal{O}$ satisfies $N(\pi) = p$,
then prove that all solutions $\alpha \in \mathcal{O}$ of $N(\alpha) = p$ are given by
$\alpha = \epsilon\pi$ or $\epsilon\bar{\pi}$ for $\epsilon \in \mathcal{O}^*$. Hint: this can be proved using unique
factorization of ideals prime to the conductor (see Exercise 7.26).

14.18. Let $\bar{E}_c$ be one of the collections of elliptic curves over $\mathbb{F}_p$ which ap-
pear in the proof of Theorem 14.18, and let $j$ denote their common
$j$-invariant. By Exercise 14.16, note that $\bar{E}_c$ consists of all elliptic
curves over $\mathbb{F}_p$ with this $j$-invariant.

(a) If $j \ne 0, 1728$, show that the curves break up into two isomor-
phism classes, each consisting of $(p - 1)/2$ curves. Hint: con-
sider the subgroup of squares in $\mathbb{F}_p^*$.

(b) If $j = 1728$ and $p \equiv 1 \bmod 4$, then show that there are four iso-
morphism classes, each consisting of $(p - 1)/4$ curves.

(c) If $j = 0$ and $p \equiv 1 \bmod 3$, then show that there are six isomor-
phism classes, each consisting of $(p - 1)/6$ curves.

14.19. In this exercise, we will sketch two proofs that there are $q(q - 1)$
elliptic curves over the finite field $\mathbb{F}_q$. As usual, $q = p^a$, $p > 3$.

(a) Adapt the proof of Exercise 14.16 to show that there are
$q$ possible $j$-invariants for elliptic curves over $\mathbb{F}_q$, and show
that there are $q - 1$ curves with a given $j$-invariant. This gives
$q(q - 1)$ elliptic curves.

(b) A second way to prove the formula is to show that there are
exactly $q$ solutions $(g_2, g_3) \in \mathbb{F}_q^2$ of the equation $g_2^3 - 27g_3^2 = 0$.
We can write this as $(g_2/3)^3 = g_3^2$, and once we exclude the
trivial solution $(0, 0)$ we need to study solutions of $u^3 = v^2$ in
the group $\mathbb{F}_q^*$. So prove the following general fact: if $G$ is a
finite Abelian group and $a$ and $b$ are relatively prime integers,
then the equation $u^a = v^b$ has exactly $|G|$ solutions in $G \times G$.

14.20. Let $\mathcal{O}'$ be an order in an imaginary quadratic field. Given an integer
$m$ which isn’t a perfect square, show that

$$|\{\alpha \in \mathcal{O}' : N(\alpha) = m\}| = 2 \sum_{0\le|a|<2\sqrt{m}} \chi(a),$$

where $\chi(a)$ is defined by

$$\chi(a) = \begin{cases} 1 & \text{if } \mathcal{O}' \text{ contains a root of } x^2 - ax + m \\ 0 & \text{otherwise.} \end{cases}$$

14.21. Use Theorem 14.26 to show that when Conjecture 14.27 is true, there
is a constant $c_3 > 0$ such that for all sufficiently large primes $l$, there
are at least

$$c_3\,\frac{l(l - 1)}{(\ln l)^2}$$

elliptic curves $E$ over $\mathbb{F}_l$ with $|E(\mathbb{F}_l)|$ twice a prime.
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